Stock prediction problem

giselle007
Posts: 5
Joined: Sun 11. Jan 2015, 22:38

Post by giselle007 »

Hello everybody,

I am quite new to neural networks and am having difficulty deciding what my network should look like
and how to get started. I calculated price ratios of 20 stocks over 1 year (5-second intraday tick data). I would like to find a correlation
between them with neural networks. To do so I use 19 stocks as inputs and 1 as the output. Every stock has to serve
as the output once, so I have to do this 20 times, I guess?

19 stocks at time t -> 3 hidden neurons -> prediction of one output stock at time t
(I don't predict into the future but stay at time t, and afterwards I compare the prediction with the real value)

I want to split my data into 4 weeks of training, 1 week of validation and 1 week of test, and I want to roll this over
the whole time period (training: weeks 1-4; next session: weeks 3-6; next session: weeks 5-8; ...).
How can I accomplish this?
I would like to find a model that gives me quick results since I don't have much time.
Can anybody help me? I would be endlessly grateful!

Greetings,
Giselle
TJetter
Posts: 348
Joined: Sat 13. Oct 2012, 12:04

Re: Stock prediction problem

Post by TJetter »

Dear Giselle,

Is it correct that all your input and output data shall be from the same time point (i.e. the 19 stock changes at time t as inputs and stock change No. 20, also at time t, as the output to be predicted)?
So you only want to train and check the correlation between stock changes at the same time points, correct?
Then you should go with a time-invariant network and start with the corresponding script:
viewtopic.php?f=14&t=233
Your net then requires 19 inputs and one output. The optimum number of hidden units has to be found experimentally; 3 units is not a bad starting point. However, don't worry too much about the number of hidden layers and their units: about 95 % of your prediction quality is already determined by the quality of your data and the (invisible but hopefully present) rules behind them.
Next, you need to determine suitable normalization settings for your input and output neurons. You can either set the normalization ranges manually for every input and output neuron or use the Normalization Wizard of MemBrain in combination with a Lesson in the Lesson Editor that has the desired absolute minimum and maximum values for all neurons in it.
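
To illustrate what "desired absolute minimum and maximum values" means for normalization, here is a minimal sketch in plain Python (not MemBrain script). It assumes a simple linear min-max mapping to the range 0..1; the function names and the sample values are made up for illustration, and MemBrain's own normalization settings may work differently in detail.

```python
def column_ranges(rows):
    """Return a (min, max) pair for every column over all rows."""
    columns = list(zip(*rows))
    return [(min(col), max(col)) for col in columns]


def normalize(value, lo, hi):
    """Map value linearly from [lo, hi] to [0, 1]; a constant column maps to 0.0."""
    if hi == lo:
        return 0.0
    return (value - lo) / (hi - lo)


# Three made-up patterns with two columns each (illustrative values only).
rows = [
    [1.00, 0.98],
    [1.02, 1.00],
    [1.04, 0.99],
]
ranges = column_ranges(rows)
scaled = [
    [normalize(v, lo, hi) for v, (lo, hi) in zip(row, ranges)]
    for row in rows
]
```

The ranges should be taken from data that covers the extremes you expect later on; values outside the range would map outside 0..1 in this sketch.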

The script mentioned above supports training with a training lesson and automatic validation with a separate validation lesson. It doesn't explicitly support a test lesson. You either have to run the script again with a different set of lessons or try to expand it to also support a test lesson. This requires some script programming skills, however.
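
As a planning aid for "running the script again with a different set of lessons", the rolling scheme from the question can be enumerated outside MemBrain. The following is a minimal sketch in plain Python, assuming 4 training weeks, 1 validation week, 1 test week and a shift of 2 weeks per session (the shift is inferred from the example in the question); the function name and parameters are hypothetical.

```python
def rolling_windows(n_weeks, train=4, validate=1, test=1, step=2):
    """Yield (train_weeks, validation_weeks, test_weeks) as lists of 1-based week numbers."""
    span = train + validate + test
    start = 1
    # Stop as soon as a complete window no longer fits into the data.
    while start + span - 1 <= n_weeks:
        train_weeks = list(range(start, start + train))
        val_weeks = list(range(start + train, start + train + validate))
        test_weeks = list(range(start + train + validate, start + span))
        yield train_weeks, val_weeks, test_weeks
        start += step


# Example: 10 weeks of data give three complete sessions.
windows = list(rolling_windows(n_weeks=10))
for train_weeks, val_weeks, test_weeks in windows:
    print("train", train_weeks, "validate", val_weeks, "test", test_weeks)
```

Each tuple tells you which weeks to export into the training, validation and test lesson files for one script run.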

Kind regards

PS: I deleted the other similar post on the same topic in the scripting section of the forum to avoid confusion
Thomas Jetter
giselle007
Posts: 5
Joined: Sun 11. Jan 2015, 22:38

Re: Stock prediction problem

Post by giselle007 »

Okay, I'm trying to use this script.
I will have data for 4 weeks. I use the first 4 days of each week as training data and the last day as validation data, and I will repeat this for every week. I had to split the data up because the Lesson Editor didn't accept my data file for all the weeks at once; I think it was too big.

For the script: is it enough to replace the file names in the adjustable constants section with my own and then run the script? Or do I need to adjust more because I have more input neurons and a different net?

string sNetName = "Stock.mbn"; // Name of the neural net
string sTrainLessonName = "Train.mbl"; // Name of the training lesson file
string sValidateLessonName = "Validate.mbl"; // Name of the validation lesson file


Does the code below mean that the training will only take 30 seconds? I have been running the script on a training data set of 25000 observations and a validation set of 6200, and it already seems to be taking much longer than 30 seconds. Can you give an indication of how long it will take, or is there a way to see that in the program?

// The overall training time [seconds]
uint TRAIN_TIME = 30;

Is there a way to compare the best nets of each week? Or do you have recommendations for the interpretation of the results?

Kind regards and thank you a lot for the help already provided! It was really useful!
Giselle
TJetter
Posts: 348
Joined: Sat 13. Oct 2012, 12:04

Re: Stock prediction problem

Post by TJetter »

giselle007 wrote: I needed to split this up because the Lesson editor didn't accept my datafile for all the weeks at once because I think it was too big
As described in the other thread, you can combine the imported lessons into one big lesson (stored in *.mbl format). However, for first tests a smaller data set is better anyway, since you will get a feeling for the data and the tool more quickly.
giselle007 wrote: Or do I need to adjust more because I have more input neurons and a different net?
No, that is automatically dealt with by MemBrain when loading your script and lessons.
giselle007 wrote: Does the code under this text part mean that the training only will take 30 seconds?
Exactly, yes. You will have to enter a more suitable number for your data and net architecture there.
giselle007 wrote: Because I have been running the script on a training data set of 25000 observations and a validation set of 6200 and it seems like its already taking much longer than 30 seconds. Do you have an indication of how long it will take or a way to see that in the program?
That is probably because your net and data are so large that the individual training steps (lesson runs) take extremely long. The first lesson run in particular takes longer due to some pre-checking and internal data preparation.
You should enter a large number for the training duration for the first attempt and see how the net error evolves during training. Maybe you will need to cut down your net architecture or data sets to get acceptable training times. Do you want to post your net here so that I can have a look at it? You can also send it to me via e-mail if you prefer, maybe together with some data set(s).
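
The interplay between a time budget and slow training steps can be pictured with a small generic loop. This is a sketch in plain Python and not the MemBrain script itself: `train_step` is a hypothetical callable that runs one pass over the training data and returns the current net error, and the assumption that a running step is never interrupted is mine.

```python
import time


def train_with_budget(train_step, budget_seconds, clock=time.monotonic):
    """Call train_step() until the time budget is used up.

    Returns a list of (elapsed_seconds, error) pairs, one per step.
    A step is never interrupted, so one slow step can overshoot the budget.
    """
    history = []
    start = clock()
    while True:
        error = train_step()
        elapsed = clock() - start
        history.append((elapsed, error))
        if elapsed >= budget_seconds:
            break
    return history


# Demo with a fake step whose error shrinks by 10 % per call.
state = {"error": 1.0}


def fake_step():
    state["error"] *= 0.9
    return state["error"]


history = train_with_budget(fake_step, budget_seconds=0.01)
print("steps:", len(history), "final error:", history[-1][1])
```

Under these assumptions, a budget of 30 seconds combined with steps that each take minutes would run well past 30 seconds, and the recorded history shows whether the error is still falling when the budget ends.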
giselle007 wrote: Is there a way to compare the best nets of each week? Or do you have recommendations for the interpretation of the results?

You should be able to establish a better/worse relation between the net candidates by their residual net error based on the validation data.
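
One way to picture such a ranking is sketched below in plain Python. It uses the root-mean-square error as a stand-in for the residual net error (MemBrain's own error measure may be defined differently), and the candidate names and numbers are made up for illustration.

```python
import math


def rmse(targets, predictions):
    """Root-mean-square error between two equally long sequences."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return math.sqrt(sum(squared) / len(squared))


def rank_candidates(validation_results):
    """Return (name, rmse) pairs sorted from best (lowest error) to worst.

    validation_results maps a candidate name to a (targets, predictions) pair.
    """
    scores = {name: rmse(t, p) for name, (t, p) in validation_results.items()}
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up validation outputs for two net candidates.
results = {
    "week1": ([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]),
    "week2": ([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]),
}
ranking = rank_candidates(results)
```

The comparison is fairest when all candidates are scored on the same validation data; errors measured on different weeks also reflect how hard each week was to predict.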
Thomas Jetter