[[Image:Wildcat-vista.png]]<span style="font-size:xx-small; color:#666666;"><br>photo (cc) [http://www.flickr.com/photos/robbaldwin-photography/4160011624/ Rob Baldwin]</span>
Wildcat is an evolutionary prototype of a fully distributed autopoietic operating ecosystem. Wildcat is an extremely reliable, available, and absolutely hassle-, scale- and maintenance-free operating ecosystem, offering computing power, storage and communication bandwidth on a fair basis, using simple and elegant economics based on a universal currency unit.
First, let's define both autopoiesis and sympoiesis:
- Autopoiesis—Self-reproduction or self-maintenance: the ability to maintain a form despite a continuous flow of material. A non-equilibrium system, typically life or similar processes, but also including natural phenomena like Jupiter's Red Spot. See also sympoiesis.
- Sympoiesis—A more open form of self-maintenance than autopoiesis, more appropriate for social and ecological forms of organization. It exhibits more diffuse structures and fuzzy boundaries.
In Wildcat, computing services on all levels of scale are offered in exchange for nanopayments using the zon, the universal currency unit for operating ecosystems. One zon is roughly equivalent to the average value of the euro and dollar.
In short, the claim is:
- Adding the concept of "money" to information technology at some fundamental level will create an autopoietic computing ecosystem at that layer and the layers above.
This exciting software experiment assesses the feasibility of an autopoietic computing ecosystem.
This project calls out for team members with a passion for and potential in the fields of computer science, distributed processing, intelligent mirroring systems and cache coherency, complexity theory, and finance or economics. This may frighten you, but do not worry: it is an experiment, and experiments—by definition—cannot fail.
Wildcat—digging deeper
Complex networked environments suffer from both extreme boredom and high stress due to unequal load balancing, leaving some resources completely untouched while others work at peak performance for long periods of time. This divide emerges at all levels of scale: within a single CPU, across multiple CPUs, within and across operating systems, at the application layer, across software applications, and also in human economies and social networks.
Most of our infrastructures adhere to Conway's law: the structure of an organization and its architecture are isomorphic. Therefore, make sure the organization is compatible with the product architecture and vice versa. The current internet displays similar traits, and its architecture is reinforced by initiatives like cloud and grid computing.
Wildcat will automagically lead to a self-organizing, self-generating, ever-evolving, highly adaptive system radiating a quality without a name on all layers. Quality here means the contextually optimal balance between often conflicting qualities like performance, scalability, reliability, availability, security and maintainability, to name a few.
There are three fundamental computing resources:
- Data processing—Consuming and processing data to produce new data. Note that processing data involves moving it around in various data storage systems.
- Data transport—Both within the same storage system and across different storage systems of the same or other type. Moving data around within a single system is not fundamentally different from moving it around on local and remote networks. Note that moving data around takes (costs) processing power.
- Data storage—Both volatile and persistent. The most significant difference between these two is that volatile storage will lose all data when there is no power or energy source.
These fundamental resources all come at a certain cost. Wildcat creates an optimal balance between them in order to optimize efficiency and effectiveness, thereby creating the most value for all stakeholders, with us humans at the top of this computing food web. The resources differ in their systemic qualities.
For example, data can be stored in the CPU's registers: extremely fast (ns timing), albeit limited in number (8-128 registers of 64 bits each) and therefore expensive to use. Because there are so few of them and they are so expensive, a primary cache is used to store more data yet have it quickly available when needed (tens of ns, and 512 KB).
In turn, a CPU's primary cache is also made using expensive memory technology and material, so a secondary cache is added which is slower, but can store more data (hundreds of ns, yet 8 MB). The limited amount of primary and secondary cache memory quickly justifies the use of what we know as a computer's main memory.
Main memory is still quite expensive and fast, though not extremely so, and massive amounts of it can be made available in larger server systems; 64 GB of RAM is not unusual for a large system. One drawback of all of the previous storage systems is that data does not survive a power failure: they are volatile memory.
Secondary memory (disks) provides relatively cheap storage in large quantities; its data is reasonably quickly accessible and persists across a power cycle. Even larger volumes are available using CD and tape libraries, but access to their data is very slow compared to disk and memory storage.
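To make the cost trade-off concrete, here is a minimal sketch in Java (the project's platform of choice). The tier names, latencies, capacities and prices below are illustrative assumptions only, not measurements and not part of any Wildcat API; the point is merely that "put the data in the cheapest place that is still good enough" becomes a small, mechanical decision once cost is explicit.
<pre>
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative only: the tiers, numbers and prices are made up for this sketch.
record StorageTier(String name, double latencyNs, long capacityBytes, double zonPerMBMinute) {}

public class TierChooser {
    static final List<StorageTier> TIERS = List.of(
        new StorageTier("registers",   1,          1_024L,                           50.0),
        new StorageTier("L1 cache",    10,         512L * 1024,                      10.0),
        new StorageTier("L2 cache",    100,        8L * 1024 * 1024,                  2.0),
        new StorageTier("main memory", 1_000,      64L * 1024 * 1024 * 1024,          0.2),
        new StorageTier("local disk",  10_000_000, 2L * 1024 * 1024 * 1024 * 1024,    0.001));

    // Pick the cheapest tier that is both fast enough and large enough.
    static Optional<StorageTier> cheapestFor(double maxLatencyNs, long sizeBytes) {
        return TIERS.stream()
            .filter(t -> t.latencyNs() <= maxLatencyNs && t.capacityBytes() >= sizeBytes)
            .min(Comparator.comparingDouble(StorageTier::zonPerMBMinute));
    }

    public static void main(String[] args) {
        // A 20 MB file that must be reachable within 1 ms ends up in main memory.
        cheapestFor(1_000_000, 20L * 1024 * 1024)
            .ifPresent(t -> System.out.println("Cheapest suitable tier: " + t.name()));
    }
}
</pre>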
Today, many experts in computer science wrestle with the tactics and strategies needed to make optimal use of all these resources, creating high-performing caching systems, addressing cache coherency, and constantly moving data around in order to meet the functional and quality requirements of these systems.
Each and every piece of hardware and software is continuously trading off cost, speed and capacity, trying to create and maintain an illusion for its users that fulfils their needs in a pleasant way. On-chip cache, primary cache, secondary cache, main memory, "swap" space, raw partitions, local disks, storage area networks, and tape and CD libraries are all examples of hard storage or caching systems.
On the application level, operating systems, database management systems, virtual machines, HTTP proxies or caches, application servers, and even end-user applications all have caching as their primary focus, although they might not even be aware of it. They have to deliver on their service in a timely way, balancing speed of service and data consistency or integrity. Do they keep data local, to serve it up quickly, yet at the danger of becoming inconsistent with the original source? Or is data integrity more important and do they have to go out over the slow network each time? Where is the trade-off? And why? And at what cost?
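That trade-off can be captured in a handful of lines. The sketch below is illustrative only; the class and its policy are invented for this example and are not part of Wildcat. It keeps data local for a configurable maximum age and only then goes back to the slow origin, making the speed-versus-consistency balance an explicit, tunable parameter.
<pre>
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// A minimal time-to-live cache: entries younger than maxAge are served locally
// (fast, possibly stale); older entries are re-fetched from the origin
// (slow, consistent). The maxAge knob *is* the trade-off.
public class TtlCache<K, V> {
    private record Entry<T>(T value, Instant fetchedAt) {}

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> origin;   // the slow, authoritative source
    private final Duration maxAge;

    public TtlCache(Function<K, V> origin, Duration maxAge) {
        this.origin = origin;
        this.maxAge = maxAge;
    }

    public V get(K key) {
        Entry<V> e = entries.get(key);
        if (e != null && e.fetchedAt().plus(maxAge).isAfter(Instant.now())) {
            return e.value();                       // fast path: possibly stale
        }
        V fresh = origin.apply(key);                // slow path: hit the origin
        entries.put(key, new Entry<>(fresh, Instant.now()));
        return fresh;
    }

    public static void main(String[] args) {
        // Demo origin that just fabricates a value; a real one would cross the network.
        TtlCache<String, String> cache = new TtlCache<>(String::toUpperCase, Duration.ofSeconds(30));
        System.out.println(cache.get("photo-metadata"));  // slow path, fetches from origin
        System.out.println(cache.get("photo-metadata"));  // fast path, served locally
    }
}
</pre>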
Embedding a fine-grained economic foundation for the most fundamental computing resources, like processing power, memory or storage, and network or system bandwidth, will automatically create an optimal computing ecosystem with zero maintenance and a highly adaptive and evolving computing environment. And, more importantly, on all layers.
In fact, computer scientists and human factors experts are illusionists through and through. They juggle data and perform magical network tricks and time-shifting hocus pocus just to let us believe that "the system" is always on, always there to serve us, and always has the most up-to-date information ready for us to consume.
Wildcat is the ultimate computing illusionist. The David Copperfield of computing, if you will.
Optimizing for functionality and quality while keeping up appearances is hard work. It not only requires scientific breakthroughs over and over again, but also consumes massive amounts of human resources behind the curtains to keep it all up and running, and to tune, configure, maintain and evolve the system. Not only in the big data centers around the world, but also locally, when you explicitly spend time and effort to synchronize your pad with the network.
If we could somehow significantly reduce the need to maintain and evolve these all-important information systems, we could grow more efficient and effective systems at a faster pace, and have more fun doing so. It becomes less of a chore.
Well, if the "genes" of this system strove for their own evolution and survival, and its foundations were able to trade resources, then the system as a whole, and all the layers above it, would automatically acquire these properties.
For example, suppose a photograph needs 20 MB of persistent storage for 400 years, and you currently need subsecond access to it because you want to make some changes. The system goes out to find a service that can meet these requirements, and is willing to pay 2 units per minute for it. Your local memory system fits the bill and offers its services. You start working on the photo, and after a few minutes you are happy with the result and leave it as is. (Note that there is no need to "save" your work; it is What You See Is What You Have. The system takes care of not losing and securing your work.)
Because keeping your photograph around in main memory is relatively expensive and volatile, the cost of keeping it there will exceed its value, so the system will try to find a cheaper and more persistent storage location. The local disk service comes up with a good alternative and your photo is moved there (again, at a certain cost).
The local storage, however, cannot guarantee to keep your snapshot for 400 years, so the system goes out on the network to find services that can. One such service replies and will store several copies of your snapshot on a number of very cheap, distributed systems.
No single one of these storage systems can fulfil the 400-year requirement by itself, but by redundantly storing the data in a number of different locations the service can, and at attractive prices. These cheap storage systems could be old PCs and old disks hanging around on the network. They may fail occasionally, be online only intermittently, or be taken away forever (become extinct). It does not matter.
Because the data is stored in many other locations, the system will always be able to find one of the copies. How many copies does the system need? Who cares? Depending on the requirements (the service level agreement), higher layers will grow or evolve services composed of more elemental storage services, and take care of how many copies to store and where. And because each and every species competes for survival at the lowest possible cost (or energy), you automatically get a very effective and efficient system.
And remember, it's all built into the genes of the system, so there is almost no need for maintenance, configuration or tuning: the system will configure and tune itself. Besides that, we are freed from even having to think about synchronizing data all the time, or about cache coherency. It becomes an ever-evolving and self-organizing system, just like life itself, yet at a deeper computing level than we are currently used to.
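The scenario above can be read as a tiny market protocol: a requirement is published, services bid, the cheapest bid that meets the requirement wins, and durability comes from redundancy rather than from any single perfect node. The sketch below illustrates that reading in Java; every type, service name, price and probability in it is an assumption made up for this example, not a Wildcat specification.
<pre>
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Illustrative sketch only: the types, services and prices below are invented
// for this example and are not part of any Wildcat API.
record Requirement(long sizeBytes, int lifetimeYears, double maxLatencyMs, double maxZonPerMinute) {}
record Bid(String service, double latencyMs, int guaranteedYears, double zonPerMinute) {}

public class StorageMarket {
    // Accept the cheapest bid that satisfies the requirement and stays within budget.
    static Optional<Bid> accept(Requirement req, List<Bid> bids) {
        return bids.stream()
            .filter(b -> b.latencyMs() <= req.maxLatencyMs()
                      && b.guaranteedYears() >= req.lifetimeYears()
                      && b.zonPerMinute() <= req.maxZonPerMinute())
            .min(Comparator.comparingDouble(Bid::zonPerMinute));
    }

    // How many independent copies are needed so that the chance of losing all of
    // them stays below (1 - targetDurability)? Assumes independent failures with
    // probability pLossPerCopy per copy over the period considered.
    static int copiesNeeded(double pLossPerCopy, double targetDurability) {
        return (int) Math.ceil(Math.log(1 - targetDurability) / Math.log(pLossPerCopy));
    }

    public static void main(String[] args) {
        Requirement photo = new Requirement(20L * 1024 * 1024, 400, 1000, 2.0);
        List<Bid> bids = List.of(
            new Bid("local-memory",  0.001, 0,   1.8),
            new Bid("local-disk",    10,    10,  0.05),
            new Bid("network-swarm", 250,   400, 0.002));
        accept(photo, bids).ifPresent(b -> System.out.println("Accepted: " + b.service()));

        // e.g. flaky nodes that each lose the data with 20% probability:
        System.out.println("Copies needed for 99.9999% durability: "
            + copiesNeeded(0.2, 0.999999));
    }
}
</pre>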
Besides answering some of the fundamental questions, some proof or evidence needs to be developed. An evolutionary prototype (as opposed to a throw-away prototype) will demonstrate the value and feasibility of such a system in LAN (high bandwidth, low latency), WAN (low bandwidth, high latency) and single-system (very high speed, extremely low latency) environments, by visualizing how data flows to optimal places depending on the requirements for accessibility, availability, reliability, persistence, and so on.
The Wildcat prototype will take processing power, storage (both volatile and persistent) and network (or data transport) into account as the basic resources, demonstrate its power and potential, visualize its behaviour and health in a vivid way, and pose some new questions to be researched.
Wildcat resonates with the Armillaria project and Wizard Rabbit Treasurer.
Roadmap
A half year into development, we'll have a party to look back at an exciting first six months. We'll evaluate what went well, what can be better, what we have learned and what still puzzles us. We'll make a list of any burning desires that we have uncovered. And we'll party and have fun before we'll go off and enjoy a well-deserved vacation.
Back from vacation, Wildcat 1.0 will be launched. Journalists, investors, colleagues, friends and family will all gather and be touched by some great new original software.
The vibrant and agile software development process leading to these two events is characterized by a continuous flow of development sprints.
Deliverables
- Working, tested, integrated, smooth and silky software that works as advertised, featuring four specific demos (or system tests if you will):
- Storage and retrieval of thousands of documents, both large (tens of MB) and small (100 KB and smaller);
- Compute-intensive distributed graphic rendering (e.g. distributed raytracing);
- Communication-intensive VoIP-like solution;
- A balanced combination of the above.
- Product website (public);
- Foundation for user ecosystem:
- Live graphical visualization of Wildcat running on hundreds of computers in a distributed heterogeneous wide area network;
- Comprehensive user manual as Wiki;
- User community process to collect feedback and requests for enhancements.
- Foundation for software ecosystem:
- API specification (javadoc);
- Sample code for demos;
- Compatibility test suite;
- Developer community process:
- Predictable personal and team velocity;
- Life-like personas;
- Steady release rhythm;
- Eating our own dogfood (using Wildcat to store all project artifacts).
Technology
The basic technology requirements are:
- Java platform
- Highest possible semantic level (RDF or better) for all externally visible data, including the upper-level peer-to-peer protocols (see the sketch after this list)
- Protocol-centric as opposed to API-centric
- Open standards only
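As a taste of what "highest possible semantic level" could mean in practice, here is a minimal sketch that describes a storage offer as RDF using Apache Jena. The wildcat.example.org namespace and the property names are invented for illustration; the real vocabulary is yet to be designed.
<pre>
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;

// Illustrative only: the namespace and properties are invented for this sketch.
public class OfferDescription {
    static final String WC = "http://wildcat.example.org/ns#";

    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("wc", WC);

        // Describe a storage offer: what is offered, how much, at what price.
        Resource offer = model.createResource(WC + "offer/42")
            .addProperty(model.createProperty(WC, "resourceType"), "persistent-storage")
            .addProperty(model.createProperty(WC, "capacityMB"), model.createTypedLiteral(20))
            .addProperty(model.createProperty(WC, "pricePerMinuteZon"), model.createTypedLiteral(0.002))
            .addProperty(model.createProperty(WC, "guaranteedYears"), model.createTypedLiteral(400));

        // Peers would exchange descriptions like this one over the wire.
        model.write(System.out, "TURTLE");
    }
}
</pre>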
Architecture
Prior to the start of the project, an architecture study will review existing distributed architectures and economic models and propose the most appropriate architectural foundation. The choice of architecture will not be a purely technical one, but will also weigh market factors like adoption, maturity, ease of development and the developer community.
Key architectural characteristics:
- Fully distributed; no central nodes by design.
- Self-organizing
- Spontaneous networking (see the sketch after this list)
- Self-healing
- Care-free
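To illustrate what "fully distributed" and "spontaneous networking" can look like in code, here is a tiny, self-contained gossip simulation; it is illustrative only, and Wildcat's actual protocol will come out of the architecture study. Each round, every node shares the peers it knows with one random acquaintance, and knowledge of the network spreads without any central registry.
<pre>
import java.util.*;

// Illustrative gossip-style peer discovery, simulated in memory.
// Each round, every node exchanges its peer list with one random known peer.
public class GossipSimulation {
    static final Random RANDOM = new Random(42);

    static class Node {
        final String id;
        final Set<String> knownPeers = new HashSet<>();
        Node(String id) { this.id = id; }
    }

    public static void main(String[] args) {
        // Ten nodes; each starts knowing only the next node in a ring.
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 10; i++) nodes.add(new Node("node-" + i));
        for (int i = 0; i < 10; i++) nodes.get(i).knownPeers.add(nodes.get((i + 1) % 10).id);

        Map<String, Node> byId = new HashMap<>();
        nodes.forEach(n -> byId.put(n.id, n));

        for (int round = 1; round <= 5; round++) {
            for (Node n : nodes) {
                // Pick a random peer we already know and merge peer lists both ways.
                List<String> known = new ArrayList<>(n.knownPeers);
                Node peer = byId.get(known.get(RANDOM.nextInt(known.size())));
                peer.knownPeers.addAll(n.knownPeers);
                peer.knownPeers.add(n.id);
                n.knownPeers.addAll(peer.knownPeers);
                n.knownPeers.remove(n.id);
                peer.knownPeers.remove(peer.id);
            }
            System.out.println("After round " + round + ", node-0 knows "
                + nodes.get(0).knownPeers.size() + " peers");
        }
    }
}
</pre>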
Agile Software Development
The software development process is characterized by an eclectic mix of agile principles and practices:
- Test-Driven Development
- XP/Scrum/Kanban
- Continuous Integration
- Customer On Site
- Regular Planning Games
- Big Visible Charts that track progress
Incremental evolving requirements
Requirements and specification will flow in during the project. There will be no detailed and complete set of specifications available at the beginning of the project. This is for the simple reason that we don't want to build "you ain't gonna need it" functionality. The project will not build the system, it will grow the system.
Innovation happens elsewhere
The deliverables of the project are intended to foster both a healthy user and developer community. Therefore, the intention is to eventually release the software solution as commercial open source software.
Open source, open APIs, open data formats, open standards, and open minds help grow the developer community. And a fair commercial licensing scheme eases adoption with users while securing a solid revenue stream for evolving Wildcat into the distant future.
To maximize impact and adoption in both the user and developer community, it is of strategic importance to keep all information within the boundaries of the project team during the course of the project. As such, all artifacts and intellectual property remain the sole property of AardRock until the licensing and communication strategy is finalized and launched into the market.
Symbiosis
The project will pursue close collaboration with other disciplines that excel in the fields of product strategy, identity, branding, interaction design, usability, marketing, sales, and customer care.
Maximal cohesion and minimal coupling between the various teams of expertise will be the key approach to secure autonomous parallel activities across teams in a symbiotic way. The various disciplines will help, augment and embellish each other.
What you can expect when joining
Passion, energy, involvement, commitment, dedication, inspiration, attention, time, joy. We expect this to be reciprocal.
Roles: customer on site, coach, mentor, wise fool, source of inspiration.
We're all about PEOPLE | SOFTWARE | HAPPINESS™.
Please contact Martien van Steenbergen for more information.