Multilayer perceptron learning capacity

In this post I demonstrate how the capacity (i.e. classifier variance) of a multilayer perceptron classifier changes with the number of hidden layer units.
As training data I use the MNIST dataset published on Kaggle as a training competition.
The network is a multi-class classifier with a single hidden layer and sigmoid activation.
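Below is a minimal sketch of the setup described above. The post does not state which library was used, so scikit-learn, the Kaggle train.csv file name and the particular hidden layer sizes are assumptions made only for illustration.

```python
# Minimal sketch (assumed setup): train an MLP with one hidden layer and
# sigmoid (logistic) activation, repeating for several hidden layer sizes.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import log_loss, accuracy_score

# Kaggle "Digit Recognizer" train.csv: first column is the label,
# the remaining 784 columns are pixel intensities.
data = pd.read_csv("train.csv")
X = data.drop(columns="label").values / 255.0
y = data["label"].values
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)

for n_hidden in (10, 25, 50, 100, 200):
    clf = MLPClassifier(hidden_layer_sizes=(n_hidden,),
                        activation="logistic",   # sigmoid hidden units
                        max_iter=300)
    clf.fit(X_train, y_train)
    proba = clf.predict_proba(X_val)
    print(n_hidden,
          "log loss:", log_loss(y_val, proba),
          "accuracy:", accuracy_score(y_val, clf.predict(X_val)))
```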

Multi-class logarithmic loss
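For reference, the loss plotted below is the standard multi-class logarithmic loss (the metric used in the Kaggle competition). For $N$ images and $M$ classes it is defined as

$$\mathrm{logloss} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\,\log p_{ij},$$

where $y_{ij}$ equals 1 if image $i$ belongs to class $j$ and 0 otherwise, and $p_{ij}$ is the probability the classifier assigns to that class.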

[Figure: minimum loss function value achieved vs. number of hidden layer units]

The plot shows the minimum value of the loss function achieved across different training runs.
Each dot in the figure corresponds to a separate training run that ended by getting stuck in some minimum of the loss function.
You can see that the training procedure can get stuck in a local minimum regardless of the number of hidden units.
This means that one needs to carry out many training runs in order to figure out the real learning capacity of a network architecture.
The lower boundary of the point cloud depicts the learning capacity. We can see that it slowly rises as we increase the number of hidden layer units.

Classification accuracy

Learning capacity is also reflected in the achieved accuracy.

[Figure: classification accuracy achieved vs. number of hidden layer units]

In this plot, as in the previous one, each dot is a separate finished training run.
But here the Y-axis depicts classification accuracy.
It is interesting that a network with as few as 10 hidden units can correctly classify more than half of the images.
It is also surprising to me that the best models cluster together, forming a clear gap between them and the others.
Can you see the empty stripe?
Is there some particular image feature that is either captured or not?

Solar wind simulation: particle bursts engine

Intro

The Solar Wind Particle Burst Engine models the solar wind by simulating bursts of particles emitted by coronal holes. The idea is to simulate the continuous flow of particles using a discrete representation of the world. We can represent the wind as a finite number of particle bursts. The world space is one-dimensional and is represented by a finite number of bins. The bins are enumerated with an index: the greater the index, the greater the distance of the bin from the Sun. The bin with index 0 therefore corresponds to the Sun's surface. At any particular moment each bin can “contain” zero or more particle bursts. At every world time tick (time is modelled discretely, in the form of integer ticks) the particle bursts move away from the Sun by leaving one bin and entering the next.
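The world model described above can be sketched in a few lines. This is only an illustration of the bins-and-ticks idea; the class and method names are made up here, not taken from the actual engine.

```python
# Minimal sketch of the described world model: a 1-D array of bins,
# each holding particle bursts; on every tick all bursts shift one bin
# away from the Sun.
from collections import deque

N_BINS = 100  # bin 0 is the Sun's surface, a higher index means farther away

class BurstWorld:
    def __init__(self, n_bins=N_BINS):
        # each bin holds zero or more bursts (here a burst is just its particle count)
        self.bins = [deque() for _ in range(n_bins)]
        self.tick_count = 0

    def emit(self, particles):
        """A coronal hole emits a new burst into bin 0 (the Sun's surface)."""
        self.bins[0].append(particles)

    def tick(self):
        """Advance world time by one integer tick: every burst moves one bin
        outward; bursts past the last bin leave the world."""
        self.bins = [deque()] + self.bins[:-1]
        self.tick_count += 1

world = BurstWorld()
world.emit(particles=1_000)
world.tick()
print(world.bins[1])  # deque([1000]) -- the burst moved one bin away from the Sun
```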
Continue reading Solar wind simulation: particle bursts engine

Solar wind observations by EPAM ACE

This is a part of the study described in the dedicated post.

We have per-minute solar wind observations (density, temperature and velocity) recorded by the EPAM instrument of the Advanced Composition Explorer (ACE) spacecraft (see this link for the data files).

These data from ACE can be used for two purposes. First, we can use a particular archive of observations (e.g. the year 2015) to fit the prediction model parameters. Second, we can use the very recent measurements coming from ACE as predictors for forecasting the wind velocity in the near future.
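As a rough illustration of how such an archive could be prepared for model fitting, here is a hedged pandas sketch; the file name and column names are assumptions, not the actual ACE archive format.

```python
# Illustrative only: load per-minute observations and aggregate them
# to hourly means before fitting a prediction model.
import pandas as pd

obs = pd.read_csv("ace_2015.csv", parse_dates=["time"], index_col="time")
hourly = obs[["density", "temperature", "velocity"]].resample("1h").mean()
print(hourly.head())
```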
Continue reading Solar wind observations by EPAM ACE

Extracting possible solar wind predictors

This is a part of the study described in the dedicated post.

I am going to perform data clean-up and feature extraction for solar wind model fitting. Coronal hole characteristics are considered to be the major predictor of the solar wind (e.g. see this paper).

I’ve got two CSV data sets that contain quantitative features extracted from Sun images with computer vision algorithms.
One file contains features originating from the “green” (193 nm) portion of the spectrum, the other from the “red” (211 nm) portion.
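A minimal sketch of how the two files could be cleaned up and joined; the file and column names are assumptions, not the actual data sets.

```python
# Load both feature files and join them on the image timestamp so each
# row carries the "green" (193 nm) and "red" (211 nm) features side by side.
import pandas as pd

green = pd.read_csv("features_193nm.csv", parse_dates=["time"])
red = pd.read_csv("features_211nm.csv", parse_dates=["time"])

features = green.merge(red, on="time", suffixes=("_193", "_211"))
features = features.dropna()  # basic clean-up: drop rows with missing values
print(features.shape)
```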
Continue reading Extracting possible solar wind predictors

Solar wind prediction

I’m going to do an experiment on predicting the solar wind speed near the Earth based on a series of Sun images.

The Skobeltsyn Institute of Nuclear Physics of Moscow State University (SINP MSU) publishes observation and prediction data on space weather, including a solar wind prediction. My experiment is to try to build a more accurate prediction based on the same initial data from SINP MSU.

The experiment is the following:

  1. Take the data from SINP MSU
  2. Prepare the features data to be used as predictors
  3. Prepare the observational data to be used as reference values
  4. Calculate error rates for the current SINP MSU model (see the error-rate sketch below)
  5. Design the computational model for the solar wind
  6. Fit the pulse-based model of the solar wind and calculate its error rates
  7. Fit a machine learning regression
  8. Compare the error rates of each of the models

For each of these experiment phases I will publish a separate post.
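As a small illustration of the error-rate calculation in steps 4 and 8, here is a hedged sketch; the metric choice (RMSE and MAE) and the variable names are assumptions, not the actual evaluation code.

```python
# Compare models by their root-mean-square and mean absolute errors
# against the observed wind speeds.
import numpy as np

def error_rates(observed, predicted):
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    mae = np.mean(np.abs(observed - predicted))
    return rmse, mae

# e.g. error_rates(v_observed, v_sinp), error_rates(v_observed, v_pulse_model), ...
```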

RFc package: FetchClimate Client for R

Today my RFc package was accepted and published on CRAN.

With this package you can right now fetch the following environmental parameters:

  • absolute air humidity
  • air temperature
  • elevation
  • diurnal temperature rate
  • frost days frequency
  • wet days frequency
  • potential evapotranspiration
  • precipitation rate
  • relative humidity
  • soil moisture
  • sunshine fraction
  • water vapour pressure
  • wind speed

The parameters above can be fetched either for a specified point set or for a specified geo grid.

The original service providing the data is a collaborative project of Microsoft Research Cambridge and the Information Technologies in Science lab, where I currently work.