Saptak's Diary: Algorithms

Saturday, September 16, 2017

Predictions and Suggestions from a machine learning based Algorithmic trading

#MachineLearning #AlgorithmicTrading #StockMarketAutomatedTrading #LogisticRegression #Boosting


An algorithm is a specific set of clearly defined instructions aimed to carry out a task or process.

Algorithmic trading (automated trading, black-box trading, or simply algo-trading) is the process of using computers programmed to follow a defined set of instructions for placing trades, in order to generate profits at a speed and frequency that is impossible for a human trader. The defined sets of rules are based on timing, price, quantity or any mathematical model. Apart from profit opportunities for the trader, algo-trading makes markets more liquid and makes trading more systematic by ruling out the impact of human emotions on trading activities.

We can create a regression formula for this purpose.

The dependent variable is the return on capital invested, and the regression can be run across all stocks.

The error term e_i can then be reduced with boosting algorithms, which fit successive models to the remaining errors and so increase the prediction accuracy.
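As a minimal sketch of the regression idea above, assuming a single explanatory variable and made-up numbers (the variable names and data are illustrative, not from the post), ordinary least squares gives the coefficients and the error terms e_i that a boosting step would then try to shrink:

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(x, y, a, b):
    """Error terms e_i = y_i - (a + b * x_i)."""
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: sales growth (%) against return on capital invested (%).
sales_growth = [5.0, 10.0, 15.0, 20.0, 25.0]
return_on_capital = [8.0, 11.0, 15.0, 18.0, 23.0]

a, b = fit_simple_regression(sales_growth, return_on_capital)
errors = residuals(sales_growth, return_on_capital, a, b)
print(f"intercept={a:.2f}, slope={b:.2f}")
print("error terms:", [round(e, 2) for e in errors])
```

With more variables the same idea applies, only the fit becomes a multiple regression; the single-variable form is used here purely to keep the arithmetic visible.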

Now, how do you choose your variables, and what could the ideal stock equation be? One candidate:

YOY Quarterly sales growth > 15 and

YOY Quarterly profit growth > 20 and

Net Profit latest quarter > 1 and

G Factor >= 7 and

Net Profit latest quarter > .33 AND

Other income latest quarter < Net Profit latest quarter * .5 AND

Net Profit preceding year quarter <= 0 AND

Expected quarterly net profit > 0 AND

Sales latest quarter > Sales preceding year quarter AND

Return on invested capital > 25 and

Earnings yield > 15 and

Book value > 0 AND

Market Capitalization > 15

AND

Graham Number > Current price AND

PB X PE <=22.50 AND

PEG Ratio >0 AND

PEG Ratio < 1.5 AND

Altman Z Score >=2.5 AND

Sales growth 5Years >25 AND

Profit growth 5Years >15 AND

Current ratio >2 AND

Market Capitalization >250 AND

Sales >100 AND

Piotroski score > 7

AND

Dividend yield > 2 AND

Average 5years dividend > 0 AND

Dividend last year > Average 5years dividend AND

Profit after tax > Net Profit last year * .8 AND

Dividend last year > .35 AND

( Profit growth 3Years > 10 OR

Profit growth 5Years > 10 OR

Profit growth 7Years > 10 )

(Market Capitalization > 3000) AND

(Average return on equity 10Years > 20) AND

(Debt to equity < 1.5) AND

(Interest Coverage Ratio > 2) AND

( PEG Ratio <= 1) AND

(Profit growth 5Years > 20)

AND

YOY Quarterly sales growth > 40 and

YOY Quarterly profit growth > 40 and

Average return on capital employed 3Years >30 and

Price to Earning < 6

Sales growth 10Years > 10 AND

Profit growth 10Years > 12 AND

OPM 10Year > 12 AND

Debt to equity < 0.5 AND

Current ratio > 1.5 AND

Altman Z Score > 3 AND

Average return on equity 10Years > 12 AND

Average return on capital employed 10Years >12 AND

Return on invested capital > 15 AND

Sales last year / Total Capital Employed > 2 AND

Average dividend payout 3years >15

AND

PEG Ratio < 1 AND

Sales > 500 AND

Price to Earning < 40 AND

Profit growth > 20 AND

Debt to equity < 0.2 AND

Price to Cash Flow > 5

EPS last year >20 AND

Debt to equity <.1 AND

Average return on capital employed 5Years >35 AND

Market Capitalization >500 AND

OPM 5Year >15

AND

Net Profit latest quarter > Net Profit preceding quarter AND

Net Profit preceding quarter > Net profit 2quarters back AND

Net profit 2quarters back > Net profit 3quarters back

AND

EPS latest quarter > 1.2 * EPS preceding year quarter AND

EPS latest quarter > 0 AND

YOY Quarterly sales growth > 25 AND

EPS last year > EPS preceding year AND

EPS > EPS last year AND

Profit growth 3Years > 25 AND

Return on equity > 17 AND

Down from 52w high < 15 AND

Market Capitalization > 100

AND

Price to Earning > 0 and Price to Earning < 10 and Return on equity 5years growth > 10 and Dividend yield > 1 and Return on capital employed > 10

AND

Profit growth 5Years > Sales growth 5Years AND

Sales growth 5Years > 3 AND

Return on equity > 15 AND

Working capital 5Years back < 0

AND

Price to Earning >0 and Return on equity 5years growth > 5 and Dividend yield >0
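The conditions above are all joined with AND, so a stock qualifies only if every rule holds. A minimal sketch of applying one block of the screen (market capitalization, return on equity, debt to equity, interest coverage, PEG ratio, profit growth) might look like this; the metric keys, company names and numbers are hypothetical, chosen only for illustration:

```python
import operator

# Each rule is (metric key, comparison, threshold), taken from one block of the screen.
RULES = [
    ("market_cap", operator.gt, 3000),
    ("avg_roe_10y", operator.gt, 20),
    ("debt_to_equity", operator.lt, 1.5),
    ("interest_coverage", operator.gt, 2),
    ("peg_ratio", operator.le, 1),
    ("profit_growth_5y", operator.gt, 20),
]


def passes_screen(metrics, rules=RULES):
    """True only if every rule holds, because the rules are joined with AND."""
    return all(compare(metrics[key], limit) for key, compare, limit in rules)


def failed_rules(metrics, rules=RULES):
    """Keys of the rules a stock fails, useful for seeing why it was dropped."""
    return [key for key, compare, limit in rules if not compare(metrics[key], limit)]


# Hypothetical companies with made-up fundamentals.
stocks = {
    "COMPANY_A": {"market_cap": 5200, "avg_roe_10y": 24, "debt_to_equity": 0.4,
                  "interest_coverage": 9, "peg_ratio": 0.8, "profit_growth_5y": 27},
    "COMPANY_B": {"market_cap": 8100, "avg_roe_10y": 22, "debt_to_equity": 1.9,
                  "interest_coverage": 3, "peg_ratio": 1.4, "profit_growth_5y": 31},
}

shortlist = [name for name, metrics in stocks.items() if passes_screen(metrics)]
print(shortlist)
print(failed_rules(stocks["COMPANY_B"]))
```

Keeping the rules as data rather than as a long boolean expression makes it easy to add or relax a threshold without touching the filtering logic.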

Note: debt enters the equation inversely. The term period is a spread over the last 15 to 20 years.

Now, by applying a boosting algorithm (such as XGBoost), you can reduce the error terms.

Based on the above equation, with a little variation and a flattened NN (neural network), the stocks below can be considered for the Indian stock market.

1) RELIANCE INDUSTRIES

2) DCB BANK

3) KAJARIA CERAMICS

4) INFOSYS

5) INDO COUNT INDUSTRIES

Monday, September 11, 2017

Why XGBoost? And Why Is It So Powerful in Machine Learning?

#MachineLearning #Algorithms #Boosting #XGBoost #MLAlgorithms #DataScience

Why XGBoost?

XGBoost is short for the eXtreme Gradient Boosting package.

BTW what is boosting?

Quick Explanation

Two common terms used in ML are bagging and boosting.

Bagging: an approach where you take random samples of the data, build a learner on each sample, and take a simple mean of their outputs to get the bagged probabilities.

Boosting: similar, but the samples are selected more intelligently. Each successive round gives more and more weight to hard-to-classify observations.

Now, coming back to XGBoost: why is it so important?
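The difference can be sketched in a few lines. The bagging function below uses the sample mean as a stand-in "learner", and the boosting function shows a single AdaBoost-style reweighting round; both are illustrative assumptions, not the exact procedure inside any particular library:

```python
import math
import random


def bagging_predict(values, n_models=25, seed=0):
    """Bagging: fit one simple model (here, the sample mean) per bootstrap
    sample, then take the plain average of the models' outputs."""
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_models):
        sample = rng.choices(values, k=len(values))  # draw with replacement
        outputs.append(sum(sample) / len(sample))
    return sum(outputs) / n_models


def boosting_reweight(weights, misclassified):
    """One AdaBoost-style round: raise the weight of misclassified
    observations and lower the rest. Assumes 0 < weighted error < 1."""
    error = sum(w for w, miss in zip(weights, misclassified) if miss)
    alpha = 0.5 * math.log((1 - error) / error)
    updated = [w * math.exp(alpha if miss else -alpha)
               for w, miss in zip(weights, misclassified)]
    total = sum(updated)
    return [w / total for w in updated]


values = [2.0, 4.0, 6.0, 8.0]
bagged = bagging_predict(values)

weights = [0.25, 0.25, 0.25, 0.25]
misclassified = [False, False, False, True]  # only the last observation was wrong
new_weights = boosting_reweight(weights, misclassified)
print([round(w, 3) for w in new_weights])
```

After one round the single hard observation carries half of the total weight, so the next learner is pushed to get it right.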

In broad terms, it’s the efficiency, accuracy and feasibility of this algorithm.

It has both a linear model solver and tree learning algorithms. What makes it fast is its capacity to do parallel computation on a single machine.

It also has additional features for doing cross validation and finding important variables.

Features - XGBoost

Speed: it can automatically do parallel computation on Windows and Linux, with OpenMP. It is generally over 10 times faster than the classical gbm.
Input Type: it takes several types of input data:

Dense Matrix: R's dense matrix, i.e. matrix ;
Sparse Matrix: R's sparse matrix, i.e. Matrix::dgCMatrix ;
Data File: local data files ;
xgb.DMatrix: its own class (recommended).

Sparsity: it accepts sparse input for both tree booster and linear booster, and is optimized for sparse input ;
Customization: it supports customized objective functions and evaluation functions.

Numeric VS categorical variables

XGBoost manages only numeric vectors.

What do you do when you have categorical data?

A simple method to convert a categorical variable into a numeric vector is one-hot encoding.
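As a minimal sketch of one-hot encoding in plain Python (the sector labels are made-up example data), each category becomes its own 0/1 column:

```python
def one_hot_encode(values):
    """Return (sorted category names, one 0/1 row per input value)."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


# Hypothetical categorical feature: the sector of each stock.
sectors = ["bank", "it", "ceramics", "bank"]
columns, encoded = one_hot_encode(sectors)
print(columns)
for row in encoded:
    print(row)
```

The resulting rows are purely numeric and can be passed to a model that does not accept string categories.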

Tree Boosting in a Nutshell

We first briefly review the learning objective in tree boosting. For a given data set with n examples and m features, a tree ensemble model (shown in the figure above) uses K additive functions to predict the output.
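In other words, the prediction for an example x_i is the sum of the outputs of K functions, y_hat_i = f_1(x_i) + ... + f_K(x_i), where each f_k is a tree. The sketch below illustrates this additive structure with plain gradient boosting on squared error, using one-split trees (stumps) on a single feature. It is a simplified stand-in: XGBoost itself adds regularization and other refinements that are not shown, and the data and function names are hypothetical.

```python
def fit_stump(x, target):
    """Best one-split regression tree on a single feature, by squared error.
    Assumes x contains at least two distinct values."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [t for xi, t in zip(x, target) if xi <= threshold]
        right = [t for xi, t in zip(x, target) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda xi: left_mean if xi <= threshold else right_mean


def fit_ensemble(x, y, K=20, learning_rate=0.5):
    """Build K additive functions; each one is fitted to the current residuals."""
    functions = []
    prediction = [0.0] * len(y)
    for _ in range(K):
        residual = [yi - pi for yi, pi in zip(y, prediction)]
        stump = fit_stump(x, residual)
        # Bind the stump now; the learning rate is folded into f_k.
        f_k = lambda xi, s=stump: learning_rate * s(xi)
        functions.append(f_k)
        prediction = [pi + f_k(xi) for xi, pi in zip(x, prediction)]
    return functions


def predict(functions, xi):
    """Ensemble output: the sum of all K additive functions."""
    return sum(f(xi) for f in functions)


def mse(functions, x, y):
    return sum((yi - predict(functions, xi)) ** 2 for xi, yi in zip(x, y)) / len(y)


# Made-up data with a nonlinear relationship.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 4, 9, 16, 25, 36, 49, 64]

one_tree = fit_ensemble(x, y, K=1)
twenty_trees = fit_ensemble(x, y, K=20)
print("training MSE with K=1: ", round(mse(one_tree, x, y), 2))
print("training MSE with K=20:", round(mse(twenty_trees, x, y), 2))
```

Each added function corrects part of what the previous ones got wrong, which is why the training error falls as K grows.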

Industry Usage?

It has also been widely adopted by industry users, including Google, Alibaba and Tencent, and various startup companies. According to a popular article in Forbes, xgboost can scale with hundreds of workers (with each worker utilizing multiple processors) smoothly and solve machine learning problems involving Terabytes of real world data.
