Thursday, December 8, 2022 5:25:59 AM

initial impressions

6 months ago
#24
Off the top of my head, here is my initial impression after using it for a day.

1. Filtering for setups. It's fruitless to try and predict the markets all the time. Much, much better to filter for specific setups. To do this, we need the ability to apply the filtering indicator to training, testing, and live trading.

2. Continuous contracts are needed but a PITA to create manually. Some tooling would help here. Especially being able to set the roll ratio.

3. My setups happen on a dozen instruments, I would like to train on all those instruments to make one model.  Not just one futures contract, but many at once.

4. The data flow is undocumented. Are you using market replay data or historical data? Is it from my local computer or from the cloud? Is it tick data? I tried plugging in an indicator that uses 1m data. The intermediate data file showed that indicator looking totally different from Market Replay. I cannot troubleshoot this without understanding the read path for the data. But as it is, I cannot trust the model at all if the indicator inputs are wrong. Mostly I do need full level 1 tick data, and do not need level two at all.

5. Which data is used to form the "normal" or "do not take a trade now" background data set? This seems undocumented. It's very important, obviously.

6. Regression usually has so much more statistical power than classifiers. Why not allow/use regression techniques alongside classifiers?

7. How is the input covariate normalization done? This seems undocumented. Are you keeping a rolling volatility estimate (like an ATR) for each indicator? Or is it a fixed normalization factor that stays constant for all time?

8. Feature importance output would be nice to help with feature selection when you have a ton of parameter choices for your indicators.

9. Which results are out-of-sample? Which are from in-sample? How do those two differ?  The labeling needs to be clearer.

I may think of other things, but there's a start.

6 months ago
#25
Thank you for your post. Here are some answers to the questions below. I'll put the responses in bold to make it easier to read.

> Off the top of my head, here is my initial impression after using it for a day.

Thank you for trying out the Deep Signal Library. Those are great questions, and I'd like to add some of the questions and answers to the online help as well.

> 1. Filtering for setups. It's fruitless to try and predict the markets all the time. Much, much better to filter for specific setups. To do this, we need the ability to apply the filtering indicator to training, testing, and live trading.

Filtering can be done for the training set by using the DSTIsValidLongDataSet or DSTIsValidShortDataSet. These two methods will allow the user to select which data sets get added to the list of training data sets used in creating the model.

Filtering for testing and running live can be done in the OnBarUpdate method in the DSTPredictModelTest or the class that was created for predicting. The user can add whatever filter they would like in conjunction with the DSTLongSignal or DSTShortSignal methods. For example, you could write a MyFilter method and call it from OnBarUpdate:

private bool MyFilter()
{
    // Add whatever filter criteria you would like to use here.
    // Return true if the setup is good, otherwise false. This is just an example.
    if (SMA(7)[0] > SMA(22)[0])
        return true;

    return false;
}

protected override void OnBarUpdate()
{
    // DST Library - The base OnBarUpdate needs to be called for updating the machine learning model data
    base.OnBarUpdate();

    if (Position.MarketPosition == MarketPosition.Flat && BarsInProgress == 0)
    {
        if (DSTLongSignal() && MyFilter())
            EnterLong(_Shares, "Long Entry");
        else if (DSTShortSignal() && MyFilter())
            EnterShort(_Shares, "Short Entry");
        ....


> 2. Continuous contracts are needed but a PITA to create manually. Some tooling would help here. Especially being able to set the roll ratio.

You can use continuous contract data, but it needs to be set up correctly in NinjaTrader. There are various topics on the NinjaTrader website but here is a link to get you started: https://ninjatrader.com/support/forum/forum/ninjatrader-7/strategy-development-aa/96311-backtesting-choosing-contract-month#post782986

> 3. My setups happen on a dozen instruments, I would like to train on all those instruments to make one model.  Not just one futures contract, but many at once.

When creating a machine learning model, you can include indicator or instrument data from multiple instruments. The NinjaScript method AddDataSeries can handle adding additional instruments (https://ninjatrader.com/support/helpGuides/nt8/?multi-time_frame__instruments.htm).

For each instrument's data, you would need to reference the BarsArray for that particular instrument. As an example, if you wanted to add just one-minute data for ES 09-16, it might look like this:

protected override void OnStateChange()
{
    if (State == State.Configure)
    {
        // Add a 1 minute Bars object for the ES 09-16 contract - BarsInProgress index = 1
        AddDataSeries("ES 09-16", BarsPeriodType.Minute, 1);

        // Add SMA(20) for ES 09-16 to the model
        AddDSTIndicator(SMA(BarsArray[1], 20), "SMA", 2);
    }
}

The data series are added in sequential order, so if you have multiple instruments you need to be sure each BarsArray index refers to the instrument you intend.


Continued in next post.

6 months ago
#26
> 4. The data flow is undocumented. Are you using market replay data or historical data? Is it from my local computer or from the cloud? Is it tick data? I tried plugging in an indicator that uses 1m data. The intermediate data file showed that indicator looking totally different from Market Replay. I cannot troubleshoot this without understanding the read path for the data. But as it is, I cannot trust the model at all if the indicator inputs are wrong. Mostly I do need full level 1 tick data, and do not need level two at all.

The data flow is the same as running a backtest. All of the data is loaded from historical data and is selected by the user in the Data Series section of the Strategy Analyzer window. You can choose the range of data by modifying the Start date and End date in the Time frame section.


> 5. Which data is used to form the "normal" or "do not take a trade now" background data set? This seems undocumented. It's very important, obviously.

The Deep Signal Library will go through each bar of data that is selected from the range in the Strategy Analyzer and find which data sets hit a long or short profit target. It will add those data sets to the respective long or short training sets and then skip ahead by the data window size plus the bars-to-target size (both measured in bars). If the library does not find a profit target for a particular bar, then it will add that data set to the "do not trade" sets and go to the next bar. There is a new feature that will be released in an upcoming version that will allow you to not skip ahead.
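
The walk described above can be sketched in plain C#. This is only an illustration, not the library's code: LabelBars, SetLabel, and the outcomeAt callback are hypothetical names, and the choice of the first and last labeled bar is an assumption.

```csharp
using System;
using System.Collections.Generic;

public enum SetLabel { DoNotTrade, Long, Short }

public static class LabelingSketch
{
    // outcomeAt is a hypothetical callback: for a bar index it returns Long or Short
    // when that bar's profit target is hit within barsToTarget bars, else DoNotTrade.
    public static List<KeyValuePair<int, SetLabel>> LabelBars(
        int barCount, int windowSize, int barsToTarget, Func<int, SetLabel> outcomeAt)
    {
        var labels = new List<KeyValuePair<int, SetLabel>>();

        // Assumption: the first labeled bar is the one that completes a full data window,
        // and the last labeled bar still has barsToTarget bars after it.
        int bar = windowSize - 1;
        while (bar + barsToTarget < barCount)
        {
            SetLabel label = outcomeAt(bar);
            labels.Add(new KeyValuePair<int, SetLabel>(bar, label));

            // A long or short set skips ahead by window size plus bars to target;
            // a "do not trade" set just moves on to the next bar.
            bar += (label == SetLabel.DoNotTrade) ? 1 : windowSize + barsToTarget;
        }
        return labels;
    }
}
```

With 20 bars, a window of 3, and 2 bars to target, a long hit on bar 4 means the next labeled bar is bar 9, so bars 5 through 8 never enter the "do not trade" set.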

> 6. Regression usually has so much more statistical power than classifiers. Why not allow/use regression techniques alongside classifiers?

That's a great idea; we can add it to our feature request list.

> 7. How is the input covariate normalization done? This seems undocumented. Are you keeping a rolling volatility estimate (like an ATR) for each indicator? Or is it a fixed normalization factor that stays constant for all time?

Normalization is done after the backtest finishes, once all of the data has been captured. The library goes through all of the data and looks for the highs and lows of all of the indicator and/or instrument data. If the indicator data was added via the default method, AddDSTIndicator(ISeries<double> dataSeries, string name, int decimalPrecision), then the library uses 2 x the highest value as the max and, if the lowest value is less than 0, 2 x the lowest value as the min. The values are normalized between 0 and 1 with the following equation:

normalized val = (x - min(x)) / (max(x) - min(x))     (https://en.wikipedia.org/wiki/Feature_scaling)

When initially adding the indicator data to be used for the model, the user can also specify their own min or max for the indicator data with the following method:

AddDSTIndicator(ISeries<double> dataSeries, string name, int decimalPrecision, double min, double max)

This will force using a particular min or max, rather than looking through the indicator data. When running live and passing indicator data to the model, if a value is larger than the max used in creating the model, the library will just use that max value.
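
For anyone checking values in the intermediate files by hand, the scaling can be reproduced with a few lines of plain C#. This is a minimal sketch rather than the library's implementation: NormalizationSketch, DefaultRange, and Normalize are hypothetical names, and two behaviors are assumptions because the description above does not cover them (a non-negative low is kept unchanged, and values below the min are clamped the same way as values above the max).

```csharp
using System;

public static class NormalizationSketch
{
    // Default range as described above: 2 x the observed high, and 2 x the observed low
    // only when that low is negative.
    // Assumption: a non-negative low is used unchanged.
    public static void DefaultRange(double observedLow, double observedHigh,
                                    out double min, out double max)
    {
        max = 2.0 * observedHigh;
        min = observedLow < 0.0 ? 2.0 * observedLow : observedLow;
    }

    // Min-max scaling to [0, 1]: (x - min) / (max - min).
    public static double Normalize(double x, double min, double max)
    {
        if (max <= min)
            throw new ArgumentException("max must be greater than min");

        // Values above max are replaced by max, as described for live data.
        // Assumption: values below min are replaced by min.
        double clamped = Math.Min(Math.Max(x, min), max);
        return (clamped - min) / (max - min);
    }
}
```

For an indicator that ranged from -5 to 50 over the backtest, the default range would be -10 to 100, so a reading of 45 maps to 0.5 and any live reading above 100 maps to 1.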


> 8. Feature importance output would be nice to help with feature selection when you have a ton of parameter choices for your indicators.

Thank you, yes, this is currently on our feature request list.

> 9. Which results are out-of-sample? Which are from in-sample? How do those two differ?  The labeling needs to be clearer.

Currently the user can select what percentage of the whole data set they want to use for training versus testing. This is done in the Strategy Analyzer Training Parameters section with the Percentage of Data for Training setting. The text files that are generated contain data for the whole data set and are not broken down into training versus test set.
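
Since the text files are not split, one way to tag rows yourself is to compute the boundary from the percentage. The sketch below assumes the split is chronological with the training rows first, which is not confirmed anywhere in this thread; SplitSketch and TrainingRowCount are hypothetical names.

```csharp
using System;

public static class SplitSketch
{
    // Returns how many rows fall in the training (in-sample) portion.
    // Assumption: rows [0, count) are training and the remaining rows are the test set.
    public static int TrainingRowCount(int totalRows, double trainingPercent)
    {
        if (totalRows < 0)
            throw new ArgumentOutOfRangeException("totalRows");
        if (trainingPercent < 0.0 || trainingPercent > 100.0)
            throw new ArgumentOutOfRangeException("trainingPercent");

        return (int)Math.Floor(totalRows * trainingPercent / 100.0);
    }
}
```

Under that assumption, 1000 rows with the setting at 70 would treat the first 700 rows as in-sample and the last 300 as out-of-sample.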

6 months ago
#27
On (3) above, I don't think I made myself clear, or there was confusion in the interpretation. Think of a design matrix (time points as rows, covariates as columns) that has a top half and a bottom half. In the top half is the data from Instrument A. In the bottom half is the data from Instrument B. If the row count is N, then the time stamps from 1:(N/2) are repeated in the rows from (N/2 + 1):N. Then you would train on that design matrix. So in this example, two instruments are used to train, each with the same set of covariates (signals). In reality I would like to use dozens to train on, to get a more general model, and to increase the sample size.
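
A minimal sketch of that stacking in plain C#, outside NinjaTrader. The layout of each row (a time stamp followed by covariate values) and the names StackedDesignMatrixSketch and Stack are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class StackedDesignMatrixSketch
{
    // Each row is one time point for one instrument: a time stamp followed by covariates.
    // Rows from each instrument are appended in turn, so the same time stamps
    // repeat once per instrument in the stacked matrix.
    public static List<double[]> Stack(IEnumerable<List<double[]>> instruments)
    {
        var stacked = new List<double[]>();
        int columns = -1;

        foreach (List<double[]> rows in instruments)
        {
            foreach (double[] row in rows)
            {
                if (columns < 0)
                    columns = row.Length;
                if (row.Length != columns)
                    throw new ArgumentException("All instruments must share the same covariate columns.");
                stacked.Add(row);
            }
        }
        return stacked;
    }
}
```

With two instruments of N/2 rows each, the result has N rows: Instrument A on top, Instrument B below, and identical columns throughout.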

> 3. My setups happen on a dozen instruments, I would like to train on all those instruments to make one model.  Not just one futures contract, but many at once.

> When creating a machine learning model, you can include indicator or instrument data from multiple instruments. ...

6 months ago
#28
Thanks for the clarification. We would need to change the platform quite a bit in order to run multiple instruments sequentially, combine the data into a multivariate matrix by timestamp, and then train the model. It would probably have to be a separate application from NinjaTrader. We do have the ability to add features from multiple instruments, but the data window that gets added to the training set would be based on profit targets from the main data series and not the other instruments that you would like.