Normalization, Standardization, etc. of input data

Have specific questions about how to work with certain MemBrain features? Not sure about which checkbox in MemBrain has which effects? Want to know if a certain functionality is available in MemBrain or not?

Your questions will be appreciated here!
MrCreosote
Posts: 55
Joined: Wed 21. Jul 2010, 18:43

Normalization, Standardization, etc. of input data

Post by MrCreosote » Wed 18. Aug 2010, 22:40

From what I have gathered so far, there are basically four options for pre-processing input data:
  • Do not pre-process the input data.
  • Allow automatic scaling, with limits set to the MAX/MIN of the input variable.
  • Manually enter the scaling limits.
  • Write a script to do any kind of scaling.

SITUATION: Exceeding of Normalization Limits

There is one situation that cannot be solved by script: when a value exceeds the limits, the current program clips it to the MAX or MIN defined for normalization. This is a problem when making predictions based on new Lessons. In many cases it is not known in advance what the limits of the inputs will be, and they will often exceed the normalization limits used for training. Usually the excess is not large and can be handled by the network without any problems.

Once a net has been trained, that training cannot be used if it turns out that the prediction inputs exceed the limits.

The only way out of this right now is to write a script that adjusts the limits of every input variable to allow some margin for excess; I usually use +/-10%. Of course, there is always the possibility that even these limits will be exceeded by newer prediction cases.
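As a sketch of that margin workaround (plain Python with illustrative names; this is not MemBrain's scripting API), the limits of each input variable can be widened before the usual [-1, +1] normalization is applied:

```python
# Hypothetical helper: widen normalization limits by a margin so that
# new prediction data slightly outside the training range is not clipped.
# Names and signatures are illustrative, not MemBrain's API.

def expand_limits(lo, hi, margin=0.10):
    """Widen [lo, hi] by `margin` (fraction of the span) on each side."""
    span = hi - lo
    return lo - margin * span, hi + margin * span

def normalize(x, lo, hi):
    """Map [lo, hi] linearly onto [-1, +1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

lo, hi = expand_limits(0.0, 100.0)   # training range 0..100 -> (-10.0, 110.0)
print(normalize(105.0, lo, hi))      # 105 now falls inside [-1, +1]
```

With a 10% margin on each side, a prediction input of 105 against a 0..100 training range maps to roughly 0.92 instead of being clipped at +1.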

NOTE: What is particularly bad about this situation is the case where the network is retrained over very long periods of time. If a limit is exceeded, that training is no longer usable and must be repeated with more margin. Since training is path dependent, the incremental trainings must be redone if the true nature of that learning is to be preserved.

REMEDY: While a script can be written to add margin, there is no guarantee that this will hold for all new predictions. So a new program option is the only way to deal with this.

SITUATION: Standardization (zero mean, unit variance)

Standardization is very popular. In terms of the normalization limits [-1, +1], zero maps to the mean, -1 to one standard deviation below the mean, and +1 to one standard deviation above (with unit variance, the standard deviation is also 1).

But since much of the data lies beyond one standard deviation, a massive amount of data exceeds the normalization limits. This is no small exception; it is quite large once you consider 4-sigma data and outliers.
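To make the point concrete, here is a small Python sketch (not MemBrain code) showing that after zero-mean, unit-variance scaling, a sizeable share of perfectly ordinary, outlier-free data already lies outside [-1, +1]:

```python
import statistics

def standardize(xs):
    """Zero-mean, unit-variance scaling: (x - mean) / std."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [(x - mu) / sigma for x in xs]

data = list(range(10))               # plain, evenly spaced, no outliers
z = standardize(data)
outside = [v for v in z if abs(v) > 1.0]
print(len(outside), "of", len(z))    # 4 of 10 values exceed [-1, +1]
```

Even for this tame sample, 40% of the standardized values fall outside the interval, so clipping at the normalization limits would distort a lot of data.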

REMEDY: While none is strictly required, since a script can be written, the popularity of standardization would make it a nice option to add to the Normalization Wizard.
_____________________________

If I have missed that these situations are actually handled by MemBrain, sorry for the wasted time, yours and mine. Which leads to the question: where are they?

Thanks
Tom

Admin
Site Admin
Posts: 438
Joined: Sun 16. Nov 2008, 18:21

Re: Normalization, Standardization, etc. of input data

Post by Admin » Fri 20. Aug 2010, 18:10

Yes, it is true that the training has to be repeated if new data exceeds the current normalization ranges.

However, as you already said, the normalization should always be chosen with some margin. If, further down the road, these margins are exceeded again by new data, then it is normally wise to repeat the training anyway, since the existing training is based on data of low relevance with respect to the data that now exceeds the limits.

In such a situation the re-training can be accelerated significantly if randomization is NOT performed before the training. This is because the existing weight configuration is normally not far from the revised optimum weight configuration for the new data range. This way the target net error is usually reached again very quickly.
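The effect can be illustrated with a toy gradient-descent example (plain Python, unrelated to MemBrain's internals): when the optimum shifts only slightly, starting from the previously trained weight reaches the error tolerance in fewer steps than starting from an arbitrary weight:

```python
# Toy illustration (not MemBrain internals): warm-starting gradient
# descent after a small shift in the data converges in fewer steps
# than restarting from an arbitrary ("randomized") weight.

def steps_to_converge(w, data, lr=0.01, tol=1e-6, max_steps=100_000):
    """Gradient descent on mean squared error for the model y = w * x."""
    for step in range(max_steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in data) / len(data)
        if abs(grad) < tol:
            return step
        w -= lr * grad
    return max_steps

old_optimum = 2.0                                  # weight from the old training
new_data = [(x, 2.1 * x) for x in range(1, 6)]     # optimum has shifted to 2.1

warm = steps_to_converge(old_optimum, new_data)    # keep the trained weight
cold = steps_to_converge(0.0, new_data)            # start as if randomized
print(warm, "<", cold)                             # warm start needs fewer steps
```

The warm start begins close to the new optimum, so the remaining error is small and the tolerance is reached noticeably sooner than from the cold start.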

Regards,
Thomas
Thomas Jetter

MrCreosote
Posts: 55
Joined: Wed 21. Jul 2010, 18:43

Re: Normalization, Standardization, etc. of input data

Post by MrCreosote » Fri 20. Aug 2010, 20:13

Thanks for the quick reply!

Yes, new training from last known weights is a very good idea since the new solution should be close. Randomization would throw all that away.

However, as you said, applying some margin is customary. Any chance that could be incorporated into the Normalization Wizard? (if yes, maybe consider adding standardization too?)

Thanks so much,
Tom

Admin
Site Admin
Posts: 438
Joined: Sun 16. Nov 2008, 18:21

Re: Normalization, Standardization, etc. of input data

Post by Admin » Sun 22. Aug 2010, 22:38

MrCreosote wrote:However, as you said, applying some margin is customary. Any chance that could be incorporated into the Normalization Wizard?
I'll consider this for inclusion into a future release, yes.
MrCreosote wrote:(if yes, maybe consider adding standardization too?)
Still not quite sure what you exactly mean here... Do you mean calculating normalization limits based on +/- X Sigma based on some input lesson?

Regards
Thomas Jetter

MrCreosote
Posts: 55
Joined: Wed 21. Jul 2010, 18:43

Re: Normalization, Standardization, etc. of input data

Post by MrCreosote » Tue 24. Aug 2010, 20:19

Yes, Standardization is the mapping:

Mean to zero
+1 standard deviation to +1
-1 standard deviation to -1

While this leaves a lot of data beyond the [-1, +1] interval, mathematically it presents a variety of advantages when trying to find a non-linear solution.

My mathematics degree does not help me much here, but from what I have read, I get the idea that the zero mean makes all the hyperplanes intersect at the origin.

Further, statistically, the solution to a pattern should be near this origin, so with all hyperplanes present in this region there should be a well-defined gradient and fewer local minima.

There is a great deal of writing promoting standardization to [-1, +1].

NOTE: If there are issues with values beyond this interval, mapping 3 sigma or higher to +1 would keep most values within the interval while still keeping the same statistical "scale" for all inputs. Zero mean is always a given for this method. But keep in mind that the MAX/MIN method of mapping is also quite legitimate. You basically have to choose between standardization and normalization.
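A sketch of that 3-sigma variant (Python, illustrative only, not a MemBrain feature): dividing by three standard deviations before clipping keeps everything up to +/-3 sigma inside [-1, +1], so only rare outliers get clipped:

```python
import statistics

def standardize_3sigma(xs):
    """Map the mean to 0 and +/-3 standard deviations to +/-1,
    clipping the rare values beyond 3 sigma to the interval ends."""
    mu = statistics.fmean(xs)
    sigma = statistics.pstdev(xs)
    return [max(-1.0, min(1.0, (x - mu) / (3.0 * sigma))) for x in xs]

z = standardize_3sigma(list(range(10)))
print(all(-1.0 <= v <= 1.0 for v in z))  # True: everything stays in range
```

The trade-off is resolution: dividing by 3 sigma compresses the bulk of the data into roughly the middle third of the interval, in exchange for almost never clipping.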
