The acceptance of machine learning algorithms in predictive analytics heavily depends on the interpretability of the results. Standard regression techniques provide superior interpretability and allow for straightforward incorporation of expert knowledge, but are often outperformed by black box algorithms in terms of predictive power. This session introduces shrinkage regression, which overcomes the shortcomings of standard regression (overfitting, moderate predictive performance, computationally intensive variable selection procedures) and allows the use of very wide datasets. Steffen will demonstrate the advantages and elegance of shrinkage regression for causal and(!) predictive analytics and show why it belongs in the toolbox of every data scientist.
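To give a flavor of the technique, here is a minimal sketch of shrinkage regression in R using the glmnet package; the simulated wide dataset and the elastic-net penalty are illustrative assumptions, not necessarily the tooling shown in the session.

```r
library(glmnet)

# Simulated "wide" data: many more predictors than observations
set.seed(42)
n <- 100; p <- 500
x <- matrix(rnorm(n * p), n, p)
y <- as.vector(x[, 1:5] %*% rnorm(5) + rnorm(n))  # only 5 predictors truly matter

# Elastic-net fit: alpha = 1 is the lasso, alpha = 0 is ridge.
# Cross-validation picks the shrinkage strength lambda automatically,
# replacing computationally intensive stepwise variable selection.
cv_fit <- cv.glmnet(x, y, alpha = 0.5)

coef(cv_fit, s = "lambda.1se")  # sparse coefficients: most are shrunk to zero
plot(cv_fit)                    # cross-validated error along the lambda path
```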
From the start, Predictive Analytics World has been the place to discuss and share our common problems. These are your people – they understand your situation. Often rated the best session of the conference, this format lets you share your challenges with like-minded professionals, your path to answers and a stronger professional network.
There will be two discussion rounds of 20 minutes each, so choose your two most pressing topics and discuss them with your colleagues.
Select your Round Table Discussion from the topics below:
- Change management on Analytics & AI (Tomasz Wyszyński)
- How to become an AI-driven company? Hire data scientists, train employees or buy tools? (Julia Butter)
- Top AI business cases in marketing, customer service, revenue management and operations across different industries (Nikita Mateev)
- Data first. Analytics second. What to start with? Collecting data or implementing analytics? (Andreas Gödecke)
- Federated networks: how will federated IT systems change data driven business models? (Robin Röhm)
Among data scientists, there is hardly a need to stress the importance of uncertainty estimates accompanying model predictions. However, in deep learning, successful though it may be, there is no straightforward way to assess uncertainty. As of today, the most promising approaches to modeling uncertainty are rooted in the Bayesian paradigm. Commonly in that paradigm, we distinguish between aleatoric (data-dependent) and epistemic (model-dependent) uncertainty. In this session, Sigrid will show how both can be modeled by blending deep learning (TensorFlow) and probabilistic (TF Probability) software. All demo code will be run using tfprobability, the R wrapper to TensorFlow Probability.
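As a small taste of what this looks like in code, here is a minimal sketch of modeling aleatoric uncertainty with keras and tfprobability: the network outputs a full distribution per observation instead of a point estimate. The toy data and the two-unit architecture are illustrative assumptions, not the session's actual demo.

```r
library(keras)
library(tensorflow)
library(tfprobability)

# Toy regression data with input-dependent noise (aleatoric uncertainty)
x <- matrix(runif(1000, -1, 1))
y <- 2 * x + rnorm(1000, sd = 0.1 + 0.3 * abs(x))

# The last dense layer outputs two values per observation: the mean and
# (via softplus) the standard deviation of a normal distribution over the target
model <- keras_model_sequential() %>%
  layer_dense(units = 8, activation = "relu") %>%
  layer_dense(units = 2) %>%
  layer_distribution_lambda(function(t)
    tfd_normal(loc = t[, 1, drop = FALSE],
               scale = 1e-3 + tf$math$softplus(t[, 2, drop = FALSE]))
  )

# Train by minimizing the negative log likelihood of the observed targets
negloglik <- function(y, model) - (model %>% tfd_log_prob(y))
model %>% compile(optimizer = optimizer_adam(), loss = negloglik)
model %>% fit(x, y, epochs = 100, verbose = 0)

# Predictions are distributions: a mean plus a learned, input-dependent spread
pred  <- model(tf$constant(x, dtype = tf$float32))
means <- pred %>% tfd_mean()
sds   <- pred %>% tfd_stddev()
```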
The success of machine learning and deep learning techniques is directly proportional to the amount of data available for training the algorithms – yet data is often spread across different datasets and cannot be centralized due to regulatory restrictions or fear of losing IP. New techniques from the field of privacy-preserving computation promise to solve these problems and help break down data silos and closed data ecosystems. This session gives an introduction to federated and privacy-preserving analytics & AI. Robin will look at the intersection of cryptography and machine learning and cover the basics of technologies such as Differential Privacy, Secure Multiparty Computation, Privacy Preserving Record Linkage and Federated Machine Learning. Furthermore, he will give an overview of the current tool landscape and the libraries that help implement these technologies, as well as provide insights into their benefits and limitations. Lastly, Robin will explain which use cases can be enabled by adopting these technologies.
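To illustrate one of these building blocks, here is a minimal sketch of the Laplace mechanism behind differential privacy, written in base R; the fictional age query and the chosen epsilon are illustrative assumptions, not material from the session.

```r
# Laplace mechanism: answer a counting query on sensitive data while
# providing an epsilon-differential-privacy guarantee.
laplace_noise <- function(n, scale) {
  # Laplace(0, scale) sampled as the difference of two exponentials
  rexp(n, rate = 1 / scale) - rexp(n, rate = 1 / scale)
}

dp_count <- function(values, condition, epsilon = 0.5) {
  true_count  <- sum(condition(values))
  sensitivity <- 1  # adding or removing one record changes a count by at most 1
  true_count + laplace_noise(1, scale = sensitivity / epsilon)
}

# Example: how many people in a (fictional) dataset are older than 65?
ages <- sample(18:90, 1000, replace = TRUE)
dp_count(ages, function(a) a > 65, epsilon = 0.5)  # noisy, privacy-preserving answer
```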
As a trade-off for their superior performance, modern ML models are typically black boxes, i.e. it is not obvious how they behave in different circumstances. This forms a natural barrier to their use in business, as it requires blind trust in algorithmic performance that often directly links to the organization's profit. For example, banking regulators and the GDPR require models to be interpretable, which can conflict with optimizing for predictive accuracy. This session gives an introduction to the rising field of explainable AI: specific requirements for interpretability are worked out, together with an overview of existing methodology such as variable importance, partial dependence, LIME and Shapley values, as well as a demonstration of their implementation and usage in R.
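For orientation, here is a minimal sketch of how these explanation methods can be applied in R using the iml package; the random forest on the Boston housing data is an illustrative assumption, not necessarily the model or package used in the session.

```r
library(iml)
library(randomForest)

# Fit a black-box model on a standard dataset (illustrative choice)
data("Boston", package = "MASS")
rf <- randomForest(medv ~ ., data = Boston)

# Wrap model and data so that model-agnostic explainers can query predictions
X <- Boston[setdiff(names(Boston), "medv")]
predictor <- Predictor$new(rf, data = X, y = Boston$medv)

# Permutation-based variable importance
imp <- FeatureImp$new(predictor, loss = "mae")
plot(imp)

# Partial dependence of the prediction on a single feature
pdp <- FeatureEffect$new(predictor, feature = "lstat", method = "pdp")
plot(pdp)

# Local explanations for one observation: LIME-style surrogate and Shapley values
lime_expl <- LocalModel$new(predictor, x.interest = X[1, ])
shap      <- Shapley$new(predictor, x.interest = X[1, ])
plot(shap)
```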