Joel Spolsky
Sometimes when you are trying to understand how the world works, basic math is enough. If we increase the flow of hot water by x, the temperature of the mixture will rise by y.
Sometimes you are working on something more complex, and you can't even begin to guess how the inputs affect the outputs. A warehouse seems to run well with up to four employees, but when you add a fifth, they start getting in each other's way and the fifth adds nothing.
You may not understand the relationship between the number of employees and the throughput of the warehouse, but you definitely know what each employee is doing. You can write some JavaScript to simulate the behavior of each of your workers, run the simulation and see what actually happens. You can tweak the parameters and rules that employees follow to see what can help, and you can really get some insight into the situation and then tackle difficult problems.
This is what hash.ai is for. Read the launch post on David's blog, then try creating your own simulations!
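The warehouse example can be sketched in a few lines. The version below is in Python rather than JavaScript and rests on one invented rule that is not from the original post: the warehouse has four aisles and only one worker can pick in an aisle at a time. It is a toy illustration of simulating each worker and watching the total, not HASH code.

```python
from dataclasses import dataclass

AISLES = 4          # assumption: the warehouse has four aisles
PICKS_PER_TICK = 1  # assumption: a worker in a free aisle picks one item per tick


@dataclass
class Worker:
    name: str
    picked: int = 0


def step(workers, aisles=AISLES):
    """Advance the warehouse by one tick."""
    free = aisles
    for worker in workers:
        if free > 0:
            free -= 1                      # the worker claims an aisle
            worker.picked += PICKS_PER_TICK
        # otherwise every aisle is taken and this worker is blocked


def throughput(n_workers, ticks=100):
    """Total items picked by n_workers over the given number of ticks."""
    workers = [Worker(f"w{i}") for i in range(n_workers)]
    for _ in range(ticks):
        step(workers)
    return sum(w.picked for w in workers)


results = {n: throughput(n) for n in range(1, 7)}
for n, total in results.items():
    print(n, total)
```

Under this rule throughput grows with each hire up to four workers and then stops growing. Changing `AISLES` or the blocking rule is the kind of parameter tweaking described above.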
David Wilkinson
Today, together with Joel Spolsky and Jude Allred, I am delighted to present HASH, the company we founded just over a year ago. We believe that most of the problems in our world arise from information failures of one kind or another. Whether it is economic collapse, war, illness, or choosing the right life partner or university degree, our mission is to help everyone make the right decisions and overcome information failures.
Brilliant innovators have sought to organize the world's information and make it accessible to everyone. The next step on this path is to make that information understandable and usable for everyone.
Well-funded, high-tech organizations (such as hedge funds) can process huge amounts of the world's information efficiently, extracting marginal gains and fractions-of-a-second advantages in trades. Meanwhile, the overwhelming majority of businesses and individuals have no way to systematically analyze the variety of signals contained in the world around them.
Simulation can make the world a better place: it can improve our understanding and perception of the world around us. Simulation is not only a useful tool for human cognition, it can also enable people to create computer representations of real world problems. In fact, models are universal interfaces available to both humans and artificial intelligence, and we believe that models can become a connective tissue between the world of humans and the world of machines.
We hope that models will help people and computers make decisions more effectively. In particular, they will help promote sustainable conflict resolution, reduce and eliminate market disruptions, and help people live happy and healthy lives. And we do not want to wait for the onset of this bright future.
If you don't want to wait either, sign up now - or read on to find out more.
Origins
I used to run a digital consulting company in London that built websites and software and ran data-driven campaigns. We worked for a wide range of clients, from private equity firms and start-ups to the largest government bodies.
From time to time we faced really interesting tasks, such as tracking the spread of diseases (for example, sexually transmitted infections), evaluating the effectiveness of measures to combat them (for example, public information campaigns), and optimizing advertising spend (i.e. identifying the nodes in a network whose influence is most likely to prevent the spread of disease).
It turns out that there is a single gold standard for finding answers to such questions in both epidemiology and behavioral advertising - "agent-based modeling" (ABM). ABM works as follows.
- Agents represent the participants: individuals, companies, households, machines in a factory, or something else. Different models represent systems at varying levels of detail. In theory, an "agent" could even be a molecule.
- Agents have properties, values attached to them. Properties vary depending on the agent. For a person, a property can be boolean (registered voter - yes / no), numerical (annual income) or multiple choice (party affiliation).
- Agents exist in a specific environment (often in several at once), for example, in geospatial maps or network graphs.
- Agents are defined by their behavior: in effect, behavior is code that describes how agents interact with the outside world and react to it.
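The four ingredients listed above can be sketched in Python as follows. All names here (`Person`, `NETWORK`, `spread`, `step`) are hypothetical and chosen for illustration. The infection rule is deliberately simplistic and deterministic. None of this is taken from a real ABM library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    # Properties: a boolean, a number and a multiple-choice value.
    agent_id: int
    registered_voter: bool
    annual_income: float
    party: str
    infected: bool = False


# Environment: an undirected contact network stored as an adjacency map.
NETWORK = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}


def spread(agent, neighbors):
    """Behavior: an agent becomes infected if any neighbor is infected."""
    if not agent.infected and any(n.infected for n in neighbors):
        return replace(agent, infected=True)
    return agent


def step(agents):
    """Apply the behavior to every agent, reading only the previous state."""
    return {
        i: spread(a, [agents[j] for j in NETWORK[i]])
        for i, a in agents.items()
    }


agents = {
    0: Person(0, True, 42_000.0, "A", infected=True),
    1: Person(1, False, 31_000.0, "B"),
    2: Person(2, True, 58_000.0, "A"),
    3: Person(3, True, 27_000.0, "C"),
}

history = [sorted(i for i, a in agents.items() if a.infected)]
for _ in range(3):
    agents = step(agents)
    history.append(sorted(i for i, a in agents.items() if a.infected))

print(history)
```

Because each step reads only the previous state, the infection advances exactly one hop along the chain per tick.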
ABMs can be built from first principles and are useful for testing what-if hypotheses, letting you safely explore digital twins of real-world systems. This makes multi-agent simulations useful for far more than predicting the spread of disease and information over a network.
Solving Problems Data Science Cannot Solve
A number of properties of complex systems make predictive modeling difficult. They concern agents, their properties and their behaviors: nonlinearity, emergence, adaptation, interdependence and the feedback loops between them. Emergent "black swan" events are, by definition, not reflected in existing patterns and historical data, and are therefore missed entirely.
There are no systems that exist in isolation - they are all part of our complex real world, and therefore all the problems of business, politics and humans, in the final analysis, are problems of understanding complex systems. In most cases, reasonable abstraction allows us to discount most of the extraneous factors, but sometimes it can be difficult to understand what, when and under what circumstances might be of interest.
In some systems none of this matters, but some questions (for example, how to foster a more stable economy or good foreign relations) can be matters of life and death. To fully understand these critical risks, we need to conduct a generalized search of the space in which they exist, based on the observed dynamics of these systems. Pattern recognition and the analysis of historical results are good for establishing a baseline, but on their own they do not reveal the essence of the problems.
Since the space around problems representing all possible configurations of the world is much larger than the historical space in which these problems were observed, it is sometimes tempting to write off correct scientific modeling and consider it unrealizable. At the same time, proper simulation does not seek to simulate all possible versions of the world that may ever arise (of course, there are infinitely many). Rather, it helps people understand which of these versions may become reality, and draws attention to possible new scenarios that are unknown to human analysts due to the nature of these scenarios.
Crises like the 2007-08 financial crash became disasters precisely because decision-makers did not understand and did not take into account the fundamental dynamics of a complex system, in this case the economy. Regulations such as Basel II introduced capital reserve requirements which, coupled with mark-to-market accounting practices, resulted in asset fire sales: participants were forced to sell into falling markets, deepening the decline.
While historical and fair value data can be used to pre-populate and backtest agent-based models, such data is not required to create an ABM. This opens the door to direct formal modeling in a wide range of areas where machine learning cannot currently be applied.
Moreover, simulations combine the advantages of formal modeling with the richness of qualitative description, which makes them highly explainable and easy for humans to understand. Unlike models that can look like a black box, agent-based simulations are verifiable: users can follow step by step how particular results are obtained and which factors contribute to them.
So why, then, is there so little talk about simulations, and why are they undervalued and rarely used?
Modern problems of agent-based modeling
Simulation requires a lot of effort, and the costs of building, operating and maintaining simulations are high. Modeling requires knowledge of specialized tools, frameworks, and even weird proprietary programming languages. The resulting simulations are often neither portable nor reusable. Where simulation logic is based on guesswork or cannot be calibrated, the results can create a false sense of confidence or security, which can compound existing poor decision-making.
Although simulation is widely used in supply chains, manufacturing, finance, defense, and more, today's market-leading agent-based simulation packages operate at limited scale and are based on legacy technologies and paradigms that do not lend themselves to distributed computing at real-world scale. Their user interfaces have not changed since the 1990s, the developer experience they offer is outdated, they do not run in the browser or on mobile devices, and users often have to install specialized software just to access them.
For the most part, these simulations are toy models designed to demonstrate certain dynamics, and they lack interoperability. Once built, models stay fragmented: few people share them, and no one builds on the results of colleagues in their own work. Most models are so constrained (to keep run times reasonable) that they capture only a small part of the dynamics of the systems they represent. Rather than building rich virtual worlds and selectively including aspects based on the results of experiments, developers create toy abstractions that are cheap and easy to explore but do not inspire confidence in users. There is deep and justified skepticism about the "scientific" nature of these toy models, and doubt that more complex models can be properly calibrated and parameterized.
Add to this the challenge of finding suitable, granular agent-level data, the difficulty of converting domain expertise into code, and the wide range of structural barriers to building ABMs, and you will understand why general-purpose modeling so often fails and is rarely used in modern business.
A simulation accessible to all
We faced many systemic problems and now we want to create system level solutions. HASH aims to solve the simulation challenge by vertically integrating the entire stack, creating a single platform for building, running, and learning from simulations.
Today we are publicly launching two parts of HASH:
- HASH Core: A web development environment and simulation viewer.
- HASH Index: A collection of simulations and modular components.
All simulations in HASH are composed of agents (represented by descriptive schemas) and behaviors (usually represented by pure functions). Behaviors drive the agents, and datasets are used to initialize and update them in simulations of the real world. Datasets can also be used to reinforce and calibrate models. Behaviors and datasets are linked to the schemas of the objects they describe, so that developers can easily find compatible components in the HASH Index and combine them in HASH Core.
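To make the idea of behaviors as pure functions concrete, here is a minimal sketch in Python. It is not the HASH API. The state layout, the `behaviors` field, the behavior names and the tiny dataset are all assumptions made for illustration. Each behavior takes an agent state and returns a new one without mutating its input, and a dataset supplies the initial properties.

```python
from functools import reduce


def age(state):
    """Pure behavior: return a copy of the state that is one tick older."""
    return {**state, "age": state["age"] + 1}


def earn(state):
    """Pure behavior: add one tick's wage to savings."""
    return {**state, "savings": state["savings"] + state["wage"]}


# Hypothetical registry mapping behavior names to functions.
BEHAVIORS = {"age": age, "earn": earn}


def tick(state):
    """Run the agent's behaviors in order; the input state is never mutated."""
    return reduce(lambda s, name: BEHAVIORS[name](s), state["behaviors"], state)


# Initialise agents from a (made-up) dataset of rows.
dataset = [{"age": 30, "wage": 100}, {"age": 45, "wage": 250}]
agents = [{**row, "savings": 0, "behaviors": ["age", "earn"]} for row in dataset]

after = [tick(a) for a in agents]
print(after[0]["age"], after[0]["savings"])
```

Because behaviors are pure and refer only to named fields, they can be combined freely as long as each agent carries the fields they expect. That is one plausible reading of why shared schemas make components easier to reuse.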
All models, datasets and behaviors are available in the HASH Index. All HASH Index content is now available for free. The HASH Index is a framework conceived as a cross between GitHub and a package manager. In the future, this environment will be expanded to create an additional marketplace that makes it easier to buy and sell paid behaviors, datasets, and simulations. In our view, companies will publish free components to gain trust and credibility, and then sell more complete simulations and consulting services.
Our future plans for the H-Index include forks, branches, discussions and pull requests - we want to add functionality from Git, which, like using package managers, is now second nature to most modern software developers.
The impact of these changes on the developer workflow is significant: as the H-Index matures, industry professionals with limited programming knowledge will be able to fork and adapt (or fully implement) existing behaviors in their simulations. This will allow them to simulate complex dynamics without having to program large-scale projects from scratch.
However, work on our products is not yet complete. Even though our lightning-fast HASH Engine allows simulations to be run at unrivaled speed, it is currently only available through the H-Core web interface, which inevitably limits it to the memory and CPU resources available in a browser tab. All of this means that while the H-Engine is designed to handle truly global simulations, our early beta users have only been able to create relatively small models. So H-Core, in its current iteration, is comparable to something like NetLogo, an academic agent-based modeling tool. NetLogo is useful for illustrating the effect of homogeneous agents in complex systems and explaining the dynamics of these systems, but it is limited when it comes to modeling real-world environments with high fidelity or at large scale. Because of these limitations, the tools for running optimization experiments (parameter sweeps, Monte Carlo simulations, and more exotic reinforcement learning) are not yet available, but they are very important to us.
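For readers unfamiliar with the terms, a parameter sweep combined with Monte Carlo replication can be sketched as follows. This is a generic Python illustration, not HASH tooling. The adoption model, the function names and the parameter values are all invented for the example.

```python
import random
from statistics import mean


def run(adopt_prob, seed, n_agents=100, ticks=10):
    """One simulation run: each tick, every non-adopter adopts with adopt_prob."""
    rng = random.Random(seed)  # a seeded generator keeps each run reproducible
    adopted = [False] * n_agents
    for _ in range(ticks):
        adopted = [a or rng.random() < adopt_prob for a in adopted]
    return sum(adopted) / n_agents  # fraction of agents that adopted


def sweep(probs, replicates=20):
    """Parameter sweep: average many seeded runs (Monte Carlo) per value."""
    return {p: mean(run(p, seed) for seed in range(replicates)) for p in probs}


summary = sweep([0.0, 0.05, 0.2, 1.0])
for p, fraction in summary.items():
    print(p, round(fraction, 3))
```

The sweep varies one input across a grid of values, and the replicates average out the randomness of individual runs, so the reported figure reflects the parameter rather than a lucky seed.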
We are releasing our roadmap for realizing these capabilities and using simulation for day-to-day real-world decision-making:
HASH Core and HASH Index are now officially in beta.
- We will be working intensively on both platforms over the coming weeks and we look forward to your input.
We are proud to announce that at the end of this year we will open the source code for the HASH Engine, the heart of our simulation system.
- At the heart of all computing in HASH is the powerful H-Engine, which is written in Rust and already has bindings for JavaScript and Python.
- Our goal is to make the platform accessible to everyone, and to enable people to run H-Engine locally and on closed systems.
- We are currently planning to release the public version of H-Engine under an open source license by the end of 2020.
HASH Cloud is also on the roadmap.
You can find out more about our upcoming products in the public roadmap at hash.ai/roadmap
We started working together a little over a year ago and now have about ten people on our team. I am incredibly proud of the team we have built and of what we have achieved in that time.
We are happy to meet HASH users and have launched a community on Slack, which can be accessed through the icon in the lower right corner of any page on hash.ai. We will be happy to help you build your models and answer your questions, and we welcome your suggestions and bug reports.
We are working to make HASH available to the widest possible audience of developers. The Rust engine has bindings for Python and JavaScript, but until recently behaviors in H-Core could only be written in JS. We are proud to announce that behaviors can now be developed and simulated in Python, running locally in the browser in H-Core. Thanks to Mozilla's amazing Pyodide project, we were able to add experimental Python support to our browser-based H-Core IDE. There are currently some performance issues, but we hope to fix them before the full deployment of H-Cloud and H-Engine (which will allow users to avoid any performance issues). Developers can now build models in HASH using Python, and also import a number of popular scientific packages (more in our documentation).
To prevent information failures, we need to create tools that did not exist before and solve problems that cannot be solved today. Giving people superpowers is our mission.
If you want to build a model using HASH, you can sign up at hash.ai/signup.
If you want to take part in our mission and help everyone make the right decisions, you can publish simulations, behaviors and data to the H-Index. You can also apply for any of our open positions at hash.ai/careers.
Finally, if you are a business decision maker and are interested in learning how HASH can be applied, contact us at hash.ai/contact.
We are grateful to HASH's early investors for their support: awesome community builders like Stack Overflow founder Joel Spolsky and Kaggle founder Anthony Goldbloom, as well as Ash Fontana and Lee Edwards of Zetta Venture Partners and Root Ventures. We are delighted to begin our public mission.
David Wilkinson
Founder and CEO of HASH
Examples of simulations
The Prisoner's Dilemma in JavaScript (+ in Python)
Market model in JavaScript (+ in Python)
Epstein's Civil Unrest Model in JavaScript (+ in Python)
Boids
More examples here.