**Decision Making**

**Neural Networks**

**Game Theory**

**Support-Vector Machines**

**Solving Risk Problems**

**Algorithms**

### Decision Making

There are many widely used traditional techniques of stock forecasting, each with its own limits and biases. Human-oriented approaches focus on historical data but struggle to define how past financials or investor behavior will be reflected in future performance or pricing. At best, the number of facts or signals a human analyst can consider and correlate simultaneously is quite limited, and the analysis is subject to personal biases.

As **AC Investment Research**, our goal is to do fundamental research, bring forward a new, scientific technology, and create frameworks for objective forecasting using machine learning and the fundamentals of **Game Theory**.

A **support-vector machine** is a tool we use to present the results of an investment-risk assessment process visually, in a meaningful and concise way. Our AI-based forecast ratings are designed to provide relative rankings of creditworthiness. They are assigned based on transparent methodologies, available free of charge on our website, which are calibrated using **stress scenarios**.

We consider the full spectrum of human trading interaction (from data-based analysis to market signals, from trend-following actions to speculative ones, and more) and adapt it to a **machine learning model**, with the support of engineers, to mimic and project everyday trading experiences. To do that we focus on an approach known as decision making using game theory: we apply principles from **Game Theory** to model the relationships between rating actions, news, market signals, and decision making.

### Game Theory

Game theory is the study of mathematical models of strategic interactions between rational agents. It has applications in all areas of the social sciences, as well as in logic, systems science, and computer science. Originally, it addressed two-person zero-sum games, in which each participant's wins or losses are exactly offset by those of the other participant. In the 21st century, game theory applies to a wide range of behavioral relationships; it is now an umbrella term for the science of logical decision making in humans, animals, and computers.
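The zero-sum idea can be made concrete with a small worked example. For a 2x2 zero-sum game with no saddle point, the mixed-strategy equilibrium has a closed form; the matching-pennies payoffs below are a standard textbook instance, used here purely for illustration:

```python
def solve_2x2_zero_sum(a11, a12, a21, a22):
    """Mixed-strategy equilibrium of a 2x2 zero-sum game given the
    row player's payoffs, assuming no pure-strategy saddle point."""
    denom = a11 - a12 - a21 + a22
    p = (a22 - a21) / denom                 # P(row player picks row 1)
    q = (a22 - a12) / denom                 # P(column player picks col 1)
    value = (a11 * a22 - a12 * a21) / denom # expected payoff to the row player
    return p, q, value

# Matching pennies: every win for one player is exactly the other's loss.
p, q, v = solve_2x2_zero_sum(1, -1, -1, 1)
print(p, q, v)  # → 0.5 0.5 0.0
```

At equilibrium both players randomize 50/50 and the game's value is zero, which is exactly the "wins and losses exactly offset" property described above.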

### Support-Vector Machines

**Neural networks** are made up of collections of information-processing units that work as a team, passing information between them, similar to the way neurons do inside the brain. Together, these networks can take on challenges of greater complexity and detail than traditional programming can handle. AI design teams can assign each piece of a network to recognizing one of many characteristics. The sections of the network then work as one to build an understanding of the relationships and correlations between those elements, working out how they typically fit together and influence each other.

In machine learning, **support-vector machines** (SVMs, also called support-vector networks) are supervised learning models with associated learning algorithms that analyze data for classification and regression; the SVM algorithm is a popular tool for both kinds of problem.
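As a minimal, self-contained sketch of the idea (not our production model), a linear SVM can be trained by stochastic sub-gradient descent on the hinge loss, in the style of the Pegasos algorithm. The toy data and hyperparameters below are made up for illustration:

```python
def train_linear_svm(data, labels, lam=0.01, epochs=2000):
    """Train a linear SVM by stochastic sub-gradient descent on the
    hinge loss (Pegasos-style updates with step size 1/(lam*t))."""
    w = [0.0, 0.0]
    b = 0.0
    t = 0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * (w[0] * x[0] + w[1] * x[1] + b)
            if margin < 1:  # point inside the margin: hinge loss is active
                w = [(1 - eta * lam) * wi + eta * y * xi
                     for wi, xi in zip(w, x)]
                b += eta * y
            else:           # correctly classified with margin: only shrink w
                w = [(1 - eta * lam) * wi for wi in w]
    return w, b

# Toy separable data: class +1 clusters near (2, 2), class -1 near (-2, -2).
X = [(2, 2), (3, 2), (2, 3), (-2, -2), (-3, -2), (-2, -3)]
y = [1, 1, 1, -1, -1, -1]
w, b = train_linear_svm(X, y)
preds = [1 if w[0] * x1 + w[1] * x2 + b > 0 else -1 for x1, x2 in X]
print(preds)  # the learned separator classifies the training points
```

The regularization term shrinks `w` toward a maximum-margin separator, which is the property that distinguishes an SVM from a plain perceptron.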

### Solving Risk Problems

In many sequential decision-making problems one is interested in minimizing an expected cumulative cost while taking into account risk, i.e., increased awareness of events of small probability and high consequences. Accordingly, the objective of this project is to present efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs), where risk is represented via a chance constraint or a constraint on the conditional value-at-risk (CVaR) of the cumulative cost. We collectively refer to such problems as percentile risk-constrained MDPs. Specifically, we first derive a formula for computing the gradient of the Lagrangian function for percentile risk-constrained MDPs. Then, we devise policy gradient and actor-critic algorithms that estimate this gradient, update the policy in the descent direction, and update the Lagrange multiplier in the ascent direction. For these algorithms we prove convergence to locally optimal policies. Finally, we demonstrate the effectiveness of our algorithms in an optimal stopping problem and an online forecast application.
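As a rough illustration of the risk measure involved (not the full actor-critic machinery described above), the empirical CVaR of sampled cumulative costs is simply the mean of the worst (1 - alpha) fraction of outcomes, and the Lagrange multiplier on the constraint CVaR <= d is updated by dual ascent. The sample costs, threshold `d`, and step size below are made-up values:

```python
def cvar(costs, alpha):
    """Empirical conditional value-at-risk: the mean of the worst
    (1 - alpha) fraction of sampled cumulative costs."""
    costs = sorted(costs)
    k = int(len(costs) * alpha)  # index of the alpha-quantile (the VaR)
    tail = costs[k:]             # the (1 - alpha) worst outcomes
    return sum(tail) / len(tail)

samples = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]  # cumulative costs from 10 rollouts
print(cvar(samples, 0.9))  # worst 10% → mean of [20] = 20.0
print(cvar(samples, 0.8))  # worst 20% → mean of [9, 20] = 14.5

# Dual-ascent step on the risk constraint CVaR <= d: the multiplier grows
# while the constraint is violated, penalizing risky policies more heavily.
lam, d, step = 0.0, 10.0, 0.1
lam = max(0.0, lam + step * (cvar(samples, 0.8) - d))
print(round(lam, 2))  # → 0.45
```

Unlike the expected cost (here 5.2), CVaR focuses entirely on the tail, which is what makes it suitable for the low-probability, high-consequence events the section describes.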

### How do predictive algorithms actually work?

1-Deep Reinforcement Learning in Large Discrete Action Spaces

Applying reasoning in an environment with a large number of discrete actions to bring reinforcement learning to a wider class of problems.

2-Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions

Introducing slate Markov Decision Processes (MDPs), a formulation that allows reinforcement learning to be applied to recommender system problems.

3-Massively Parallel Methods for Deep Reinforcement Learning

Presenting the first massively distributed architecture for deep reinforcement learning.

4-Adaptive Lambda Least-Squares Temporal Difference Learning

Learning to select the best value of λ (which controls the timescale of updates) for TD(λ) to ensure the best result when trading off bias against variance.

5-Learning from Demonstrations for Real World Reinforcement Learning

Presenting Deep Q-learning from Demonstrations (DQfD), an algorithm that leverages data from previous control of a system to accelerate learning.

6-Value-Decomposition Networks For Cooperative Multi-Agent Learning

Studying the problem of cooperative multi-agent reinforcement learning with a single joint reward signal.

7-Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step

Demonstrating an alternative view of the training of GANs.

8-Risk-Constrained Reinforcement Learning with Percentile Risk Criteria

Presenting efficient reinforcement learning algorithms for risk-constrained Markov decision processes (MDPs) and demonstrating their effectiveness in an optimal stopping problem and an online marketing application.
