Random thoughts on time, symmetry, and distributed systems.

On geometry, commutativity and relativity

TL;DR: It all boils down to the definition of time and the rupture of symmetries.


A distributed system is a system in which code can be executed independently on more than one instance and will give the same results wherever it is executed.

On the ideal distributed system as a vectorial system.


For a distributed system to work, you need a minimal property of the functions that are passed: the operations need to be commutative (and distributive).

Let A be a set of data, f and g functions that apply to the data, and A[i] the subset of the data on instance i.

f(g(A)) == «sum» of (f ∘ g)(A[i]) over all instances/partitions.

Distributed systems, to avoid SPOFs, reroute operations to any instance that is available. Thus the results should be identical wherever they are computed.

We can either work iteratively on a vector of data, or in parallel on each element, as long as there is no coupling between the elements (which can be expressed as: for k, l with k != l and k, l < i, A[k] · A[l] == 0 — i.e. the elements are orthogonal/without relationships, thus the set of elements is a basis of size i).

The map-reduce philosophy is about stating that data in n different locations can be treated independently and then reduced.


There are 2 kinds of functions (given you work in the basis):
* Transformations (V) => V: functions that apply a geometric transformation to the space (rotation, translation, homothety, permutation), also called Observables.
* Projectors (V) => Vi: functions that reduce the number of dimensions of the problem.

Data is a ket |ai> of states.
Transformations are Operators applying on the kets, such that O|ai> = |bi>.
If there exists an operator O^-1 such that O × O^-1 = identity, then O is reversible: it is a Transformation or mapping.

O is called the function,
|ai> is the input data,
|bi> is called the output.



If dim(|bi>) < dim(|ai>) we have a projector.
If dim(|bi>) > dim(|ai>) we have a local increase of information in a closed system.


Given well-known functions that are linear, for a composed function to be a transformation of the significant space of the data we need the property that O × P = P × O, i.e. [P, O] = 0 (the commutator of the two operators is null); then you can do out-of-order execution.



But sometimes Projectors and Transformations do commute:

from random import randint
from numpy import array as a

MAX_INT = 30
DATA_PER_SERIE = 10
MAX_SERIE = 100

data = a([a([randint(0, MAX_INT) for i in range(DATA_PER_SERIE)]) for s in range(MAX_SERIE)])

# dividing the sum, or summing the divided values: the projector (sum)
# and the linear transformation (division) commute
print(sum(data) / len(data))
print(sum(data / len(data)))


In an actual CPU, DIV and ADD are NOT commutative.

time(ADD) != time(DIV), at the very least because the size of the circuits is not the same, and because min(time) = distance/c, where c is the celerity of the propagation of the information carrier. (If the information carrier is the pressure of the electron gas in the substrate: electrons have a mass and travel way slower than light, but pressure is a causal force, so c is the speed of light.) What is true in a CPU is also true when considering a distributed system.

Computers introduce losses of symmetry; that is the root of all the synchronization problems.



It happens when we have fewer degrees of freedom in the studied system than in the space of the input.

When we do this, it means that we are storing too much data.

To store just enough data you need a minimal set of operators O, P ... Z, each operator commuting with the others. It is called a basis.

Given a set of data expressed in the basis, the minimal operations that are commutative are also called the symmetries of the system.

Applied to a computer problem, this might puzzle a computer scientist.

I am globally saying that the useful information that makes you able to make sense of your data is not in the data, nor in the functions, but in the knowledge of which functions, as a pair, commute when applied to the data.

Knowing whether two dimensions i, j in a set of data projected on the basis are independent is equivalent to saying that i and j are generated by two commuting operators.
I am saying that even when I don't know the basis of the problem and/or the coupling, if I find two operators such that for any input [O, P] = 0 // OP|a> = PO|a>, THEN I have discovered an element of the absolutely pertinent data.

Given actual data |ai> and |aj> where max(i) = n,
then <ai|aj> = 0 if and only if there exists a Projector I that projects |ai> and |aj> onto two different transformations.


The iron rule is that the number of degrees of freedom lost by applying I must never result in having fewer dimensions than the basis.

First question: how do I get the first function?
Second: how do I know the size of the basis of the functions that, combined together, describe the system in its exact independent degrees of freedom (the minimum set of valuable data)?
And last: how do I get all the generators once I know one?

Well, that is where human beings are supposed to do their job, that is where our added value is. In fact, you don't search for the first operator of the basis, you search for sets of operators that commute.

Determining the minimum set of information needed to describe a problem exactly, with independent pieces of information, is called compression.

So what is the problem with big data? And time?

Quantum mechanics/vectorial/parallel computation is nice, but it has no clock.

In fact I lie.

If, given n operations [O0 ... On] applied to a set of data, there is one called P such that [On, P] != 0, then we can't choose the order of the operations.

The rupture of symmetry in a chain of observables applied to data introduces a thing called time.

As soon as this appears, we must introduce a scheduler to make sure the chain of commuting observables is fully applied before chaining the next set of operations. This operation is called reduce.
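
A minimal sketch of that idea in python (the operator names are mine, not a real API): two commuting maps can be applied in any order without a clock, while the reduce is the barrier that must wait for all of them.

from functools import reduce
from random import shuffle

data = [1, 2, 3, 4]

double = lambda x: x * 2  # double(triple(x)) == triple(double(x)):
triple = lambda x: x * 3  # the two operators commute

ops = [double, triple]
shuffle(ops)  # any execution order gives the same result: no clock needed
mapped = [reduce(lambda v, op: op(v), ops, x) for x in data]
assert mapped == [6, 12, 18, 24]

total = sum(mapped)  # the reduce: a barrier that waits for every map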

That is the point where a system MUST absolutely have a transactional part in its operations.

Now let's talk about real world.

Relativity tells us that time varies from one system to another. On the other hand our data should be immutable, but data that don't change are read-only.

And we want data to change, like our bank account figures.

And we also want not to need to go physically to our bank to withdraw money. And banks don't want you to spend more money than you have.

This property is called transactionality.  It is a system accepting no symmetry, thus no factorisation.

It requires that a chain of operations MUST not be commutative.

At every turn a non linear function must be performed:
if bank account < 0 : stop chain.


This breaks the symmetry, and it requires a central point that acts as an absolute frame of reference (with its clock for timestamping).

Banks are smart: they just don't use fully transactional systems, nor distributed systems; they just use logs and some heuristics. There must be a synchronicity timing attack possible on such a system.

On the other hand, since operations cannot be chronologically commutative on a computer, and all the more on a set of computers, the main challenge of a distributed system is «time stamping» the events.

We have known since Einstein that clocks cannot be the same across a distributed system without a mechanism.
Everybody thinks NTP is sufficient.

But NTP has discrete drifts. These drifts, which are almost unpredictable (sensitivity to initial conditions), introduce a margin of uncertainty on time.

Thus for every system the maximum reliable granularity should be computed, so that we can ensure the information physically has the possibility to be known before/after every change.

The bigger the system, the higher the uncertainty (relativity ++).
Given the reduced set of operations that commute on the data, the clock should also be computed from the length of the longest operation.


  

An opinionated versioning system based on mapping version strings to numbers in a weird base

While we have a convention in python for numbering: 
http://legacy.python.org/dev/peps/pep-0440/

We can mostly say that, thanks to "Windows 9", an interesting spotlight has been shed on version comparison.

There are 2 camps in version handling:
- the naïve, who consider versions as strings;
- the picky, who consider versions as a very dark grammar that requires to be parsed with an ABNF-compliant parser.

Well, of course, I don't agree with anyone :) Versions are just monotonically growing numbers written in a weird base, but they must have at least comparison operators: equal, superior/inferior, is_in.

Naïve people are wrong, of course

 

It gives the famous reasoning for why windows might jump to windows 10 (legacy code matching version strings against "windows 9" would also catch windows 95/98):
But is it better than:
https://github.com/goozbach/ansible-playbook-bestpractice/blob/915fce52aa82034cfd61cfbfefad9cf40b1e4f48/global_vars.yml

In this ansible playbook they might have a bug when CentOS 50 comes out.

So, this does not seem to hit only the «clueless» proprietary coders :)


Picky people are much too right for my brain


Yes, python devs are right that we need a grammar, but we don't all do python.

Given Perl, freebsd, windows ... our software needs versions not only for interaction with modules/libraries within the natural ecosystem (for instance pip), but it should also fit nicely in upper container version conventions (OS, containers, other languages' conventions when you bind to foreign language libraries ...). Version numbering needs a standard. And semantic versioning proposes a grammar but no parsers. So here I am to help the world.

The problem is we cannot remember one grammar per language/OS/ecosystem, especially if they are conflicting.

PEP 440, with the post/pre weird special cases, does not look very inspired by the tao of python (in the probably wrongful opinion of someone who did not take the time to read all the distutils mailing list because he was too busy fighting a lot of software bugs at his job, and doing nothing at home).

So, as when there are already a lot of standards you don't understand or can't choose from ... I made mine \o/

Back to basics: versions are monotonically growing numbers that don't support + - / *, just comparisons

 

A version is a monotonically growing number.

Basically, if I publish a new version it should always be seen as superior to the previous one. Which is basically a property of numbers.

In fact a version can almost be seen as a 3 (or n) figure number in a special numbering, such as

version_number = sum(map(project_number_in_finite_base, "X.Y.Z".split(".")))

The problem is that if we reason in fixed-base logic, we have an intel memory addressing problem: since every X, Y, Z number can cover an infinite range of values, there can be a loss of monotonic growth (there can be confusion in ordering).

So we can abstract version numbers as figures in an infinite base that are directly comparable.

I am luckily using a subset of PEP440 for my numbering that is the following http://vectordict.readthedocs.org/en/latest/roadmap.html

By defining
X = API > Y = improvement > Z = bugfix

I state for a user that: given a choice of my software, I guarantee your version numbers to be growing monotonically on the X / Y / Z axes, in such a fashion that you can focus on API compatibility, implementation (if the API stays the same but the code changes without a bug, it is a change of implementation), and correctness.

Like some devs, I also informally use "2a", as in 1.1.2a, to mark a short term bugfix that does not satisfy me (I thus strongly encourage people to switch from 1.1.2x to 1.1.3 as soon as it comes out). I normally keep the «letter thing» in the last number.

If people are fine with API 1 implementation 1, they should easily be able to pin versions and grow to the next release without pain.

So how do we compare numbers in an infinite dimensional basis in python?

Well, we have tuples \o/

Thanks to the comparison arithmetic of tuples, they can be considered to be numbers when it comes to "==" and ">", and these are the only 2 basic operations we should need on versions (all other operations can be derived from the latter).
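
A minimal sketch of the idea (the helper name as_tuple is mine, and this is not the parser linked below):

def as_tuple(version, width=3):
    # parse "X.Y.Z" into a comparable tuple, padding with zeros
    # so that "2" compares as 2.0.0
    parts = [int(p) for p in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))

assert as_tuple("2") == as_tuple("2.0.0")
assert as_tuple("1.1.3") > as_tuple("1.1.2")
assert as_tuple("10.0") > as_tuple("9.9.9")  # no "Windows 9" string trap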

A version is a monotonically growing number, but in a non-fixed base.

Next_version != last_version + 1

if a version is a number V, comparison of V1 and V2 makes sense; addition or subtraction cannot.

One of the caveats of version numbering, though, is our confusing jargon:
if we decided versions were X.Y.Z, why do we expect version 2 to be equivalent to 2.0.0 instead of 0.0.2? Because when we say python version 2 we expect people to hear python version 2.x, and preferably the latest. Same for linux 2 (covering 2.x.y ...): it is like writing the number «20» as «2» and expecting people to correct it according to the context.

So the exercise of version comparison is having a convention for comparing numbers along the API, implementation and bugfix dimensions, hierarchically speaking, in spite of the indetermination introduced by inconsistent human notation.


Just for fun, I made a parser of my own version strings to a numbering convention, including the latter twist where 2 means 2.0 or 2.0.0 when compared to 1.0 or 1.0.0. It addresses the examples to solve given in PEP 440.


It can be seen here.


Wrapping up


For me a version is an abstract representation of a number in an infinite base, whose figures are hierarchically separated by points that you read from left to right.
I am saying the figures are made of a tuple from a two dimensional space of digits and letters, where digits matter more than letters. (Yes, I am putting a number in a figure, it is sooo fractal.)

But most important of all, I think a versioning string is a representation of a monotonically growing number.

I am pretty sure PEP 440 is way better than my definition: it has been crafted by a consensus of people I deeply respect.

My problem is that I need to achieve the same goal as them with less energy than they have to spend on modeling what a version number is.

That is the reason why I crafted my own deterministic version numbering that I believe to be a subset of PEP 440.

Conclusion

 

My semantics might be wrong, but at least I have a KISS versioning system that works as announced, is easily portable, and for which I have a simple grammar that does quite a few tricks and an intuitive comprehension.

And human beings are wrong too (why is version 2 read as 2.0.0 when compared to 2.1.1, as 2 when compared to 2.1, or as 2 when compared to 3?), but who cares? I can simply cope with it.

NB: it works with the "YYYY.MM.DD.number" (SOA serial) scheme too.

PS: thinking of adding the y-rcx stuff by slightly enhancing the definition of a figure.

PPS: I don't like talking to people normally, so I disabled comments, but for this one I am making an effort: http://www.reddit.com/r/programming/comments/2iejnz/an_opinionated_versioning_scheme_based_on_mapping/
because I am quite curious about your opinions

Perfect unusable code: or how to model code and distributivity

So let's speak of what deterministic and non-deterministic code really are.

I am gonna prove that you can achieve nearly chaotic series of states with deterministic code \o/

Definitions:

Deterministic: code is deterministic if the same input always yields the same output.

Chaotic: a time series of values is considered chaotic if knowing the last n samples does not make you able to predict the t+1 term.

Turing Machine: a computer that is not worth more than a cassette player.

Complex system: a set of simple deterministic objects connected together that can result in non-deterministic behavior.

lambda function: a stateless function, without internal state.

FSM (finite state machine): a stuff necessary in electronic because time is relativistic (Einstein).

Mapping: a mathematical operation/computer stuff that describes a projection from a discrete input dimension A to a discrete output dimension B.


Now let's play real life turing machine.

Imagine I give you an old K7 (cassette) player with a 30-minute tape, and at every minute mark the tape tells the result of n x 3.
If you go to minute 3 the K7 will say 9.
If you go to minute 5 you will hear 15.

This is the most stupid computer you can have.
My tape is a program. The index (minutes) is the input, and the output is what is said.

So let's do it in python. Basically we did a mapping from the index on the tape (in minutes) to the integers, one that yields index (in minutes) x 3.
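
Something like this minimal sketch (the naming is mine, assuming a 30-minute band):

# the tape: every answer is precomputed, the index is the input
TAPE = [minute * 3 for minute in range(30)]

def play(minute):
    # rewind to `minute` and press play: a lookup, not a computation
    return TAPE[minute]

assert play(3) == 9
assert play(5) == 15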



So what do we learn with this?

That I can turn code into a turing machine that I can use as code, with a 1:1 relationship: I have a ... mapping \o/

What does compile do?
It evaluates, for every possible input (an integer belonging to [0:255]), every possible output of a boolean function. It is a projection of 2^8 inputs => 2 outputs.
I projected a discrete space of input onto a discrete space of output.
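
A hedged sketch of what such a «compilation» could look like (the helper names compile_predicate and run are mine): pack a boolean function over [0:255] into the bits of a single integer.

def compile_predicate(f):
    # bit i of the resulting int holds f(i)
    code = 0
    for i in range(256):
        if f(i):
            code |= 1 << i
    return code

def run(code, i):
    # evaluating the «compiled» function is reading bit i
    return bool((code >> i) & 1)

div2 = compile_predicate(lambda n: n % 2 == 0)
div3 = compile_predicate(lambda n: n % 3 == 0)

assert run(div2 & div3, 12)      # a multiple of 6
assert not run(div2 & div3, 4)   # not a multiple of 6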

Let's see why it is great

My code is fully deterministic and threadsafe, because it is stateless.

It is an index of all the 256 solutions of f(x), for every possible value.

If I encode a function that tells if a number can be divided by X, and another one by Y, then to get the function that tells if a number can be divided by (X * Y) I just have to apply & (the bitwise and operator) to the ints representing the codes (provided X and Y are coprime, as 2 and 3 are).

An int is a very cool storage for a function.
With div2 / div3 I can, by applying all the «common bitwise operators», create a lot of interesting functions:

div2xor3: a code that indicates numbers that can be divided by 2 or 3 but not 6
not div2: every odd number
div2or3: multiples of 2, 3 or 6
div2and3: multiples of 6 only
....

I can combine the 16 blitter (two-operand boolean) operations to directly obtain functions.

In functional programming you build partial functions that you apply in a pipe of execution; here you can directly combine the code at the «implementation level».


My evaluation always takes the same number of cycles, I don't have to worry about the worst case, and my code will never suffer from indetermination (neither in execution time nor in results). My code is ultimately threadsafe as long as my code storage and my inputs are immutable.


My functions are commutative, thus I can distribute them.

div2(div3(val)) == div3(div2(val)) (== div6(val))

=> combining functions is a simple AND of the codes

Why we don't use that in real life

First there is a big problem of size.

To store all the results for all the possible inputs, I have to allocate the cross product of the size of the input by the size of the output.

A simple multiplication-by-3 table for all the 32 bit integers would be an array of 4 billion words of 32 bits: 16 GBytes!

Not very efficient.

But if we work on a torus of discrete values, it can work :)

Imagine my FPU is slow and I need cos(x) with an error margin such that working in 1/256ths of a turn is sufficient. I can store my results as an array of precomputed cosine values, indexed modulo 256 :)
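
A sketch of such a table (assuming the angle comes in 256ths of a turn):

import math

# 256 precomputed values: the whole circle, one entry per 1/256th of a turn
COS = [math.cos(2 * math.pi * i / 256) for i in range(256)]

def fast_cos(i):
    # the torus: indexes wrap around with % 256
    return COS[i % 256]

assert fast_cos(0) == 1.0
assert fast_cos(256) == fast_cos(0)  # wrapping, as on a torus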

A cache with memoization also uses the same principle.
You replace code that takes long to compute by a lookup in a table.

It might be a little more evolved than reading a bit in an integer, but it is globally the same principle.

So actually, that is one of the biggest uses of the turing machine: efficient caching of precomputed values.

Another drawback is that the mapping makes you lose information about what the developer meant.

If you just have the integer representing your code, more than one function can yield the same code. The mapping from the space of possible functions to the space of solutions is a surjection.

Thus if you have a bug in this code, you cannot revert back to the algorithm and fix it.

If I consider that I have not a number of n bits as input, but n inputs of 1 bit constituting my state input vector, and that the output is my internal state, then I am modeling a node of a parallel computer. This «code» can be wired (a few clock cycles' cost) as a muxer that is deterministic in its execution time and dazzlingly fast.


What is the use of this anyway?

Well, it models deterministic code.

I can generate random code and see how it interacts.

Conway's Game of Life is a setup of turing machines interconnected with each other in a massively parallel fashion.

So my next step is to prove I can generate pure random numbers with totally deterministic code.

And I can tell you I can prove that the condition for my modified game of life to yield chaotic-like results is that the level of similarity between the codes of the automata is low (the entropy of patterns is high) AND that 50% of the bits in the code are 0/1 (maximizing the entropy of the code in terms of the ratio of bits).

 






Date conventions: an extraordinary conservatism that is screwing our life and code

The other day, as distracted as I am, I had to reprint, re-sign and resend all my forms.

How?

I had to write the date in numbers.

Well, september sounds like seven/septem (in latin), thus it is 07... wrong, it is 09. Every time I get caught.

What is this mess?

Days of the week are supposed to be the 7 observable planets (at the time the convention was set) of the solar system (used in astrology, a sign of scientific seriousness).

Some months are 31 days because some emperor wanted to be remembered as "as great (august) as the others" (julius).
The year does not begin on the shortest day of the year, else it would be a pagan fest discrediting the roman catholic empire. But it must have begun in march once in the past (thus september = 7 ... december = 10).
It is made of 12 solar months because of a superstition that makes 13 a dangerous number...

And  in computer coding it is a fucking mess.

Not to mention the unreliable local timezone information, so versatile because of political conventions that it makes it impossible to timestamp information in a truly reliable way.

Why all of this?

We spend a non-negligible amount of time and energy dealing with this hell while trying to promote progress; fixing it would easily and substantially improve our lives.

We can change protocols and languages in a snap of the fingers to accelerate the speed of software execution, but one of the biggest burdens for reliable data storage is still there under our nose. And we are blind; it is the date mess.

Let's face it: a change would be welcome and cool, with mostly benefits.

Dropping superstition and politics from our calendar would have more advantages. Here is my idea:

First, I like solar calendars (a calendar that begins or ends when the distance to the sun is either the biggest or the smallest), the shortest day being the winter solstice on the 21st of december. It used to be a "pagan" fest. But who cares, it is fun to have a feast when the worst of winter (or the best of summer) is at its peak.

Then 365 = 364 + 1 = 13 x 28 + 1

Well, 13 months of 28 days :) 28 days is coincidentally a lunar month, which in itself would make it a hybrid catholic/islamic compliant calendar.

When to add an extra day, and where?

Well, obviously, when at some point the officially shortest day is no longer the same as the sun's. Which day? Let's say that the world's special day of every year is new year's eve; it could be nice to also have a day off on the longest day of the year, to enjoy it on bissextile (leap) years. Or to share the pleasure of enjoying new year's traditions with people from the other hemisphere in the same season :)

This calendar will help the kids make sense of the world.

Knowing that Janus was an old forgotten two-faced god representing present existence (one head looking to the past/memories, the other to the future/projects), worshiped by a gone civilization, won't help kids much more than learning the seasons and when and why they happen this way.

So I have a simple, internationally translatable scheme for naming.

Kids will learn days and math at the same time:

The first day of the week will be 0D or zero day, then 1Day/Tag/Dias/whatever, then 2D ... By convention, as a nod to those who gave us the actual calendar, the representation of day could be D, like the first letter of day in latin (dies).
 
The second would be incremented by 1, and so on. Every culture could use whatever sounds nice for the days' names and try to relate them to numbers. It does not have to be said "one"; it could be mono, uno, ein, the day of whatever comes by one.

This way, kids and people learning a new language learn basic numbering and week days at the same time...

I propose the week begins, like hours, with 0; this way we have good algebraic rules for calculating, and it makes day/time math consistent.

A work week is thus an interval [0 : 5].
The week day in 4 weeks + 5 days is (d + 4 × 7 + 5) % 7 = (d + 5) % 7.
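
A sketch of that day arithmetic in python (the helper name is mine):

def weekday_in(d, weeks, days):
    # weekday index after `weeks` weeks and `days` days, starting from day d
    return (d + weeks * 7 + days) % 7

assert weekday_in(0, 4, 5) == 5  # the weeks vanish: only the days matter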

And NOW I could write my date in numerical format the same way I say the day in the "usual way of speaking". I don't have any more traps when I want to know what day of the week it is in 16 days. I don't need conversions ... I am happy to forget a load of useless, tricky information.

Anyway, with my idea, we introduce math and knowing the days of the week at the same time in school.

Okay, there is the case of the intercalary day, which is special. For this one there will be a rule, but now, at least, time calculus becomes as consistent as the metric system. And we can eradicate this Aristotelian stupidity that everything should be periodic and harmonic for a science to be true.

We make them learn calculus in different bases (24/60/7/13/365) and give a tinge of introduction to the reality of the world (telling them that days don't make exactly 24 hours because there is an epsilon of chaos in the real world that doesn't match our first models, and that periodicities don't always match whole numbers). So it is cool. It is a lot more information packed in a single convention.

We teach kids how to deal with inconvenient maths and how we humans cope with it. And learning compromise is cool.

Kids can be taught to watch the moon, the sun, the seasons, and everything makes sense.

You can introduce trigonometry by calculating the hour of the day from the course of the sun. You can tell the story of how, with a camel and a stick, you can compute the earth's radius...

You can relate the abstract universe and its vastness to our everyday life. You can encompass a great deal of our civilizations' progress simply by changing the date conventions. Indians, Arabs, Europeans, Mongols, Chinese, Babylonians ... they all gave us something worthy for the new calendar. It is not a calendar based on forgetting the past; on the opposite, it is the topmost conservative calendar. It would be more dense in knowledge and culture and values than ever. It becomes a compressed set of knowledge, easier to manipulate even for those without the knowledge.

Sticking to the date mess is not a technical issue, but a civilizational one: it is just us being cavemen hitting on our keyboards, respecting weird divinities of the past out of superstition. Superstition, even disguised in the technical words of cargo cult science, is still superstition. Blindly spending billions on software bugs due to a stupid convention seems less efficient than fixing the real world. It would improve our lives even outside the computer world.
But I still write software to make things easy for people who want the kind of progress that makes their life easier without changing what gives us trouble in the first place: not computers or math, just stupid conventions.

Practical sense and laziness are pretty much the 2 main qualities of humanity I would like a system to make us share. Not a strongly superstitious, anachronistic, history-loaded artifact that screws up my everyday life. So that's my proposal for a better world.


Non linearity and clocks in distributed systems

How the word distributed matters

Let's imagine Einstein existed, and two observers live with different clocks.

According to their differential acceleration, their clocks diverge according to a 3rd observer.

Let's name observer 3 the user, and the two components of a distributed system A & B.

Let's assume A & B have a resource for treating information per second called bandwidth. Let's assume that the more load there is, the more time it takes to treat an instruction. Let's say that it means the clock «slows down».

Let's assume that tasks are of the form f x g x h ... (data), where f, g and h are composable functions. And let's assume A or B can each treat 2 possibly overlapping sets of functions.

Now let's have some fun and compare choosing only functions that support distributivity vs non linear functions, and see what happens.

If {f, g, h} is an ECOC (Ensemble Complet d'Opérations qui Commutent: a complete set of commuting operations), then f x g x h gives the same results as h x g x f applied to the data.

Thus, the distributed system does not need a clock, because g(data) does not have any hidden relationship with f(data) in terms of chronology. It works perfectly as mere message iteration between functions. The functions can be applied in any order. No chronology needed.

Whereas if you use non linear functions you introduce time, and your system must have a central clock that will be the limiting factor of your «distributed system», aka the fastest speed at which you can guarantee transactionality.

Ex: banking system.

I cannot let an actor spend money if his credit/account = 0.

So if I send in parallel the actions of putting money and retrieving it, there is an order. I cannot let a user trigger withdrawal after withdrawal (which are fast operations) as long as the slow accounting information has not been treated.

I have to put a lock: something to ensure the absolute state at a reference moment called now. As a result I cannot distribute my tasks in an even fashion on a distributed system. There is a global lock that must exist.
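
A minimal sketch of that hidden clock (the Account class is hypothetical):

import threading

class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self._lock = threading.Lock()  # the absolute reference called «now»

    def deposit(self, amount):
        with self._lock:  # deposits alone would commute with each other...
            self.balance += amount

    def withdraw(self, amount):
        with self._lock:  # ...but this non linear check does not
            if self.balance - amount < 0:
                raise ValueError("insufficient funds")  # stop the chain
            self.balance -= amount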

Because of pure geometry, acquiring and releasing this lock has a minimal, incompressible delay.

Inside a single CPU: speed of light, centimeters, nanoseconds.
Inside a LAN: network latency, meters, X ms.
Inside a WAN: kilometers, XXX ms.
Worldwide: X seconds.


Since there can be no global state without information being transmitted, a global state requires the introduction of a unique absolute arbiter.

The more distributed your system, the more redundant, the more your clock slows down if you have transactions. That is an incompressible limiting factor.

What happens with Commutative operations?

Well, they are dazzlingly fast. You can implement a robust, dazzlingly fast distributed system.

Commutative functions can be executed out of order.

They just require to be routed as many times as needed in the right circuitry.

Meaning you can avoid a congested node easily, because you can reroute actively at the node level without knowledge of the global system, just from the state of congestion of your neighbours.


An efficient distributed system should support only distributive operations that are asynchronously delivered. And, reciprocally, if you want an asynchronous system that can scale, you should not accept non commutative operations.

The commutative part should be distributed; the non linear one should more logically be treated, ideally, on a strongly synchronous, non distributed system: a system where the clock is ideally one cycle, thus ideally close to the metal, using all the tricks of atomic instructions.

Oh, another lad is asking why you can't use your fancy nosql distributed system to have a nice leaderboard for his game.
Well, a > b is non linear; sorting is non linear. So well, somewhere a clock or a lock or a finite state machine or a scheduler has kicked in.
Non linear operations introduce a before and an after, with loss of information.

If you need the result of a > b, a and b being the results of 2 distributive operations, then you have to wait for both a and b before processing in a non reversible way.

To ensure everything was made in the right order, in a non distributive way, there had to be time, so that actions are made in the right order. Every non linear operation in a distributed system introduces a hidden clock.

Are non linear operations bad?

I call them filters. I think they are a hell of a good idea, but I say it is very hard to make them distributed. So we should live with them and architect our distributed systems accordingly.

(PS: in the map reduce idea, Map can be seen as dedicated to the stuff that is commutative, and Reduce to the non commutative stuff)


PS: Let's imagine a distributed system with Forth or RPN.

Given:
def radius: square SWAP square sum
def square: DUP *
def sum: +

Can I write a² + b² as a distributed system?

Data stack: a b
Exec: distributed radius

Can I apply the operations square, SWAP, square, sum in any order?
No.

f(a, b) = f(b, a), thus it is commutative. But then what is the problem?

The use of a stack introduces a scheduling, because there is now an order relationship on the application of the operations, hidden in the data structure/passing. So a queue is also introducing a clock.
Distributed systems should be wired (message passing) and not programmed (actively scheduling tasks).

Heaviside function: a systemic mathematical root of social inequity

Abstract 

Just a random theory for fun, nothing really serious.

Assumptions

  1. I hate analytics, so it will be a formal reasoning;
  2. We consider that social inequity is the inequity in the differential between tax being paid and tax being received, at the nth order;
  3. We consider that the social agents interact as entities formed in a network, and that they tend to be over represented the more «utility/wealth» they have (I still don't know of any hobo making it to the parliament);
  4. We consider that part of these interactions are with a special entity called the «state» that has feedback loops on the agents:
    1. some for taking (VAT, IRS ...);
    2. some for giving back (education, health...);
    3. all these agents are interconnected and may have delays in the propagation of the feedbacks;
  5. We consider that there is an agent called parliament that can interact with the «state» in such a way that it can change the network and the functioning of the agents;
  6. We consider that the utility function by which a given agent sets his choices is based on a rationality built on:
    1. a sum of dis/imitation of a neighbourhood;
    2. global rationality (the mathematical choice that maximizes my utility);
    3. temperature (a random factor where you put morals and stuff);
    4. temporal rationality (based on a short term memory);
    5. partial access to the information related to the utility/state vector of the agent;

By the systemic nature of tax redistribution, our societies are bound to tend towards extremely unequal societies

So basically we have a complex system: a set of simple systems interconnected together. It belongs to a young branch of mathematics called «complex systems».

These systems are quite unnice: it is very hard to analyse them mathematically, even though some statistical physics can help. Simulation can help. But reasoning is better.

So what is my beef all about?

The BAD Guy

This is the problem: the Heaviside step function, H(x) = 0 for x < 0 and H(x) = 1 for x >= 0.


This function is a non linear function. If I introduce it in any equation, I cannot use the usual mathematical means to predict tendencies. Averages, trends, estimations cannot work, by nature, with these functions. So it means that mostly all predictive models based on «linear algebra», such as matrices, averages, derivatives, estimations, don't work.

And, I pretend I can solve it.

Just let's acknowledge that laws have effects.
Let's acknowledge that law is often formulated with stuff such as: IF income > x k$ THEN pay x% taxes, ELSE pay y% taxes.
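
In python, the bad guy looks like this (thresholds and rates made up for the example):

def tax_rate(income, threshold=50_000):
    # a step: two nearly identical agents land in different clusters
    return 0.30 if income > threshold else 0.15

assert tax_rate(50_000) != tax_rate(50_001)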

So we clearly have my bad guy hidden everywhere.

Now, let's have fun and imagine the utility (money) flowing through each of these cellular automata, based on the hypothesis that wealth is evenly distributed at the origin and that interactions are randomly distributed.
At some turns out of n, taxes are paid;
at n turns, income can be randomly given based on the discrete state of the automata;
at some turns, the automata rules are changed by a subset of the people with more utility.

So the question is: how will it evolve?

Well, it is like visualizing a huge body with cells and a heart pumping. Which is nice.


I can predict that if there are Heaviside functions used by the «state», then the system will evolve more often, and with bigger amplitude, towards an unfair system than a system with linear functions and no binary criteria...

The problem lies in the fact that there are «acausal» things in this system. Or delayed feedback loops. And they tend to amplify violently.

Acausality means that an effect can have an effect on the cause (but always later in time). Taxing people too much will impact next year's potential income. The state's wealth is like a gigantic bathtub, but globally it requires sum(income) == sum(outcome), and the income is solely taxes (I cheat, I know: I am closing the «state» system whereas it is an open system).

You will notice that the time constant of a feedback loop varies. Revenue taxes need one year to retropropagate, while VAT feeds back almost immediately. Thus there are asymmetries both in the time constants and in the amplitudes of the feedbacks.

So now, we hit the run button of the simulation.

We follow agent 1, randomly chosen to need money from the state (food stamps? parental break? sickness? ...). Utility increases.

Another agent may lose money coincidentally at the same moment (parking ticket, donation, ...). Utility decreases.

Now, we could imagine it is already the turn for paying your annual income tax.

And there is this Heaviside function tearing apart 2, 1 and the rest into clusters...

Time, in this asynchronous system, is the accident of accidents. Every time a transaction is made amongst agents, time increases (discretely).

Randomly, things will happen, with the same odds for everyone, unless their utility is null. When utility is null, you can't play outgoing interactions.

Now, 3rd turn, we are already playing the election: 1 and 2 are either above or below the utility of the crowd, so their odds of playing the election game are distinct.

At 0 utility you cannot play the game of the election.

I make the following assumption: the probability of being elected is represented by a non linear but growing function of the utility (wealth) that is 0 for 0, given the right kind of utility (if you have money but no time, you don't have «available wealth» for an outgoing interaction). The rule's name: "pas de bras, pas de chocolat" (no arms, no chocolate).

So in the decision of presenting myself I have to base my choice on:
my current mathematical interest;
my odds of winning;
and my «imitation factors».

Statistically, it is small, and should be considered the same kind of noise as random photons exciting the oxygen in the sky...

But let's add a little realistic bias to the agents:
they have a short term memory;
they have all the more chance to predict the future the more education they have.
In fact, this is too strong a hypothesis. Let's just say that something regulated by the state creates an asymmetry of information. I won't treat the case of multiverse rationality per agent, but it should be treated. You can model them as sets of relationships to information, some being random (a religious interpretation, or a star being behind the sun), some being relevant (bribery), plus a set of rules to edict a future outcome based on «values». These have of course a feedback loop from the taxes. These supersets of individuals are moral persons, thus almost regular agents (polymorphism), like religions/schools/companies. For each of the supersets an individual is in, the agent has access to rules and information based on a ratio of «fit with my own interest according to my memory». And such an agent can't sit at the parliament, but they can increase the odds of winning for the people belonging to their superset.

For the sake of realism, we will consider that the lower this fitness variable is, the bigger the alteration of the information: we corrupt randomly either a relationship or a rule.

But it is way too hard to code, so let's try the simple model that does not change much: a global major education (jacobinism) for the network of agents, where education is mostly a question of where you live, inclusive-or of how much money you have (through both your patrimonial and indirect income).

The agents still have a short term memory and partial access to the information, based on flags describing their cluster.

Well, at turn three I have 2 possible outcomes: 2 clusters of 1 and 1 cluster of n - 2, or 1 cluster of 2 plus the other ones.

Not much.

But every time an event happens, the simple fact that there are gaps makes it repropagate.

And since we said nothing about the height of this gap, it can make the difference between Charybdis and Scylla.

Imagine that you go to jail: you cannot earn money, you can't play the game of the election.

Imagine that thanks to the tax system you have a wonderful contract from DARPA. You are a cluster of one, but your utility for trying to change the system, because of potential bad surprises, rises. Who wants to pay taxes when you can rationally avoid it with less investment in utility?

Plus, the more education you have, the more you share your rationality with other agents, the more you see the feedback loops and can predict the future (given that your rationality favours your agent), and the more you acknowledge the utility of sticking together. Karl Marx's Capital, in my opinion, was more useful to the powerful in understanding the need to act as a class, because they benefit the most from it. I sometimes wonder how much Karl Marx helped the emergence of the capitalism he was so strongly denouncing.

And remember, my situation impacts my neighbours on the network (wife, family ...).

Then, the more you see the feedback loops join interests with yours, the more likely you are to adopt the interaction with the agent... and it spreads the faster, and the wider in amplitude, the more it benefits you. It is strongly contaminating.

Every agent has its own reaction time, based on its channels of information.

Ex: some people know the Fed's new rates before they are even announced on the market. [find the link with orders passed at 14:00 in NY while the announcement was at 14:00 in Chicago]

So ... why does Heaviside make a difference?

This wheel-of-fortune of non linear effects also happens in a system without gaps.

The difference is that if you happen to put in a continuous function, the result will smooth out the non linear effects after n turns. Newcomers will come and live with the favored according to a progressive effect.

There will still be local optima with linear functions that will make small valleys of clusters. But their depth will be smaller.

The Heaviside function will of course cluster the population MORE, with more impact. By putting a binary flag state on every single step, introducing discrete domains with distinct rationalities, you try to best know the channel that favours you most while risking less. Some will have an interest in changing the laws to preserve the situation based on their interests; others will have a rationality of changing the «winning domains». Just random stuff you could model. Without knowing anything, you already know that the winning rationality will have to favour cluster effects, because the symmetry in the cause will have an impact on the effect. And it will be all the more efficient as it feeds back positively. All winning rationalities in a Heaviside-based complex system WILL favour strong discriminations that favour the clusters created by the initial Heaviside function. Here is the Capital's central thesis: there is a clear mathematical incentive for the more powerful to regroup together and, since they are favoured in their probability of having a positive action on the system for them, to favour clustering for their better good. They should favour laws that work over all of a society that is (arbitrarily) fractioned by bigger gaps.

I don't say every agent in the same conditions shares the same rationality. Warren Buffett or Bill Gates asking to pay more taxes seem to contradict me.

It is just an effect of numbers. Of imitation, spreading of information, majority of behaviour, and cumulative reinforcing effects.

The existence of the bias in representation/power systematically favours the strong clustering of artificially conflicting rationalities. And it is, in my opinion, very hard to say whether education makes wealth or the opposite.
So saying that the clusters favoured in terms of wealth OR education will have a tendency to be over represented in parliament is clearly the right way to say it.

Saying that the more represented will favour their interests is kind of a trivial fact.

Rich people without education (no information, just lucky guys who won the lotto) won't care.

Poor people without education won't care.

The rich (favored by the clustering), and the relatively poor who acknowledge the bias (unfavored by the clustering), given education, will care to change the system.

Now, if we introduce the fact that there is a clash when the tension is too big (when we can measure an antagonist rationality between two clusters that exceeds a certain amplitude), then it becomes unstable.

Every agent will tend to choose the information node/rule set that best serves its interests according to its rationality.

But, thanks to the clusters, the nature of the Heaviside function, and all the binary flags introduced by the Heaviside functions, people's income will be levied through different paths that require different sets of information.

Thus we have diverging rationalities. And given enough education, there must be a conflict. If you see you have no chance of filling a gap, you don't try to fill the gap, you just change the gap. People will mechanically fight others belonging to arbitrary domains.

The funniest conclusion is that in my model, the 99% should be called the 1%, and the 1% should be called the .01%.

1% vs .01% is the fight between (those who have favouring clusters and access to information) vs (those who don't belong to the most interesting clusters but have access to enough information to see it, or who are favoured but have an opposed rationality).

The simple fact of criticizing the 1% is already a proof you belong to the 1%.

The 99% movement, the occupy wall street stuff, is not about trying to solve the inequity problem; it is about asking for a new order, because it is a rational choice for people who just want to be in the .01% and don't have access to it, yet.

Political disclaimer: I belong to the movement «we should all be the 100% and living happily ever after». The 100% in short.

So now, one big question: is it intrinsically bad to have a system that is more unstable than it would otherwise be? Is the discrete clustering bad?

Let's rephrase: do you prefer the funkiness of war, or the boringness of happiness and peace? Well, it depends of course on whether you have to die in the war or earn money from it.

A system that systematically induces arbitrary, self-amplifying clusters of population has less chance of being stable than a system without clustering.

In natural language: a society where rules are applied without any discrimination on the nature of the citizen is less likely to tend towards instability and strong self-amplifying discriminations. These discriminations are purely mathematical, amplified artefacts. Should we let artefacts rule our lives?

It is kind of better when people in a society share the same rationality, and there is less paranoia when the information is more symmetric.

Our systems are thus chaotic by nature, and behave more stochastically than they should, just because of a stupid function that introduces an arbitrary amplification of discrimination. It should be fixed. The laws should be rewritten to get rid of all the possible formulations like: IF blah THEN this ELSE that.

I agree strongly on the importance of sanctioning wrong behaviours or protecting the youngest (which are strong Heaviside functions); I don't agree with the multiplication of unnecessary non linear clauses (IF SEX | EARN more than x$ .... THEN ....) in our social systems. They cluster us, and they make the effect of the law unpredictable, thus arbitrary. And as a human, I prefer control.

How could I prove I am right/wrong?


Well, if I were serious I would bring proof. I would have to make a simulation, give data, and make a model. Then I would claim to the world that I am an unrecognized genius, but I don't care. I am just waiting for my wife to come back, and it is my way of relieving the stress.


However, I gave multi agent simulation a try: https://github.com/jul/KISSMyAgent
It could be used to model this. And I am pretty sure that by running a lot of simulations we would find that all our known systems (democratic, republican, communist, monarchist) are prone to this effect.

But I hated programming this stuff. So I don't recommend it.

I went on to an implementation based on distributed agents: https://github.com/jul/dsat

I began to use it in conjunction with graphite/carbon to store the results. But it is faster to run the simulations in my head than on the computer, so I prefer to go directly to the results. ;)

So there it was: a recreational theory that is probably useless, but it was in my brain. So I unloaded it.

Just for fun, I also just described a purely asynchronous distributed system.
It means that with too many non linear interactions, any real distributed system (the cloud, big clusters of distributed applications) also has this instability property.

Just think about it: I am saying that the cloud will be unstable one day by nature. I am saying that the day it breaks, it will break in a massive, violent snowball effect, the more non linear rules are introduced (non linear: switching traffic between interfaces, rejecting jobs on timeout, according more resources to tasks that are already greedy on CPU instead of fixing the algo...). And since the effect is non linear, we have no possible assessment of when and how. I have a strong suspicion the breakout will be violent and undetectable. One day, you will wake up with an irreversible situation that affects you without any possibility of having foretold it. Thus, no insurance can cover this phenomenon. No science... yet. We don't have mature analytical, theoretical and empirical tools to study these.

If I were you, I would not rely on systems that are chaotic and built by engineers who don't seem to see any problem with that. You just have a system that is stable as long as a given piece of network equipment in China doesn't flap its small BGP wings too much, but that oddly resists people trying to destroy its backbone with nukes....

I just hope it does not happen before I am retired. ;) On the other hand, I am just a single guy without any credibility, and I seem to be a little too dramatic. So, let's say it is just another stupid theory with no interest.

RFC 01 Human Handshaking protocol for instant messaging (work in progress)

Abstract


Instant Messaging (IM) can be disruptive and cognitively hard to handle because it requires context switching. This results in 2 potentially counter productive effects:

  • lowering the quality of the conversation for both parties, who are not equally concentrated;
  • introducing a repulsion towards this protocol.

Since this is a human problem, this proposal is a human based solution.

Proposal


When you want to talk to someone, you ask for «real availability», a «time slot» and a «summary» of what you want to talk about, given a «priority». It is in the interest of both parties to agree on something mutually beneficial.

The idea is to propose a multicultural, loosely formal flow of conversation for agreeing to a talk in good conditions.
 

Implementation. 


Casual priority is fine and is the only proposed level.
Default arguments are:
  • time slot: 10 minutes (explained later). NEVER ask for more than 45 mins;
  • summary: What's up? (salamalecs, explained later);
  • priority: casual (except if you want people to dislike you).

 

Time negotiation


Ex: «hey man, can you spare some 10 minutes for me?»


The interrogative formulation should put your interlocutor at ease, so he understands he can refuse or postpone.

Asking for an explicit time slot helps your interlocutor answer truthfully.

If the receiver is not answering, it means he or she cannot.

Don't retry the opening message aggressively. Spacing the requests gracefully should be based on the history of conversations you have had. If you have not talked to someone in over a year, don't expect the person to answer you back in 5 mins, but rather in the same amount of time as since you last interacted.

If you really want to push, multiply each retry delay by an order of magnitude. The minimum time before re-pushing should be set according to how busy your interlocutor is, your proximity with the person, and your «average level of interaction» over a rough moving average of one month.

It should never go below 5 mins for the first retry (with a good friend you interact a lot with) and 15 mins for a good friend you have not talked to in years.

(try to find a rough simple equation based on sociogram proximity)
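
A sketch of the spacing rule (the function and its numbers are my own guess at the above, not a settled formula):

def retry_delay(attempt, base_minutes=5):
    # first retry after `base_minutes`, each next one an order of magnitude later
    return base_minutes * 10 ** attempt

assert retry_delay(0) == 5      # good friend: 5 minutes
assert retry_delay(1) == 50     # then almost an hour
assert retry_delay(2) == 500    # then most of a day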

Summary/context.

Announcing the context

At this point, the talk is NOT accepted.
A tad more negotiation may be needed.

It is cool for the persons interacting to have a short summary, so that people can know if it will be "information" (asymmetric, with a higher volume from the emitter), "communication" (symmetric), or "advice" (asymmetric, but reversed).

Default is symmetric. Asymmetry is boring, and if your case is asymmetric you should think about NOT using IM.

Context: 

business related/real life related/balanced

Default:  balanced.

If you use IM for business related stuff, I don't think this proposal applies to you. There are multiple ISO norms for handling support. People also tend to dislike doing free consulting in an interruptive way, out of the blue. If you poke someone to ask him business related stuff, you are probably asking for free consulting. Please, DON'T. There is no such thing as free beer. If you must, you should clearly propose a compensation, even a casual one at the beginning.

Ex: Please, can you give me 10 mins of your time between now and thursday on IEEE 802.1q? I will gladly pay you back with a coffee sunday for your help.

Notice the importance of being polite. DON'T use imperative forms, they express orders. Use politely structured forms. Give all the information in a single precise statement.

The more you need the advice, the less pushy you should be. It means you value this person much and you should not alienate her/his good will.

Default : Salamalecs (work in progress)

When greeting each other, you can't help but notice that muslims/persians have an efficient, advanced human protocol for updating news on a social graph, called in french salamalecs.
http://en.wikipedia.org/wiki/As-salamu_alaykum

I don't know about the religious part, but the human/cultural behaviour that results is clearly a handshaking protocol that seems pretty efficient.

I don't know how to transpose it yet into an occidental way of thinking, but I am working on it.

Receiver expected behaviour


People, in my opinion, tend to answer too much.

You have a life and a context. If you trust the person poking you, you expect him to know the obvious:
  1. you may not have time to answer;
  2. you may be dealing with a lot of stuff;
  3. it may be unsafe (either you are driving, or at a job interview);
  4. you may not be interested in the topic, but it does not mean you don't like the person.
Learn to not answer and not be guilty.

In the old days we tended to send an ACK to every solicitation, because network delivery could fail (poorly configured SMTP, netsplit....) and we could not know if the receiver was connected.

Today, we are receiving far more solicitations and we may forget about old messages.

If you did not answer, have faith in your interlocutor to repoke you in a graceful way. The x2 between every solicitation is based on the law of expectation («espérance»; reference needed) when having incomplete information about the measure of an event.
Believe me, mathematically it is pretty much a good idea to space every important solicitation by a 2x factor (kind of like DHCP_REQUEST).

Once the topic/time are accepted, you can begin the conversation.
Content negotiation SHOULD not exceed 4 lines/15 minutes (waiting/1st retry included). The speed of negotiation should give you a hint about the expected attention span of the receiver.
If you can't spare the time for negotiating, DON'T answer back. It is awkward for both parties.

Time agreement: When // for how long.


Minimum time slot: 7 mins.

Experimentally this makes for better conversation; it makes you able to buffer your conversation in your head and raise the bandwidth.

Using a slow start that is casual and progressively gets into the subject can be regarded as the human counterpart of old-time modems negotiating the best throughput.

Your emitter is NOT a computer. Civility and asking questions about the context will help you adapt; it is not wasted time. It is clever to ask for news that correlates with your receiver's ability to be intellectually available. Slow start means you should not chain questions in one interaction.

ex (of what NOT to do): Are you fine? How are your kids? Is your job okay?

Multiple questions are NOT a good opening. Always serialize your opening.

Making a branch prediction with combined questions may give awful results.

What if the guy lost his wife and kids due to his tendency to workaholism?

Once the time is agreed, you can set a hard limit by saying: clock on.

It is cool to let the person with the busiest context call the clock off.

It is fun to hold to your word about time. You'll learn in the process how time-consuming IM is.

A grace period after the clock is off is required to close the conversation gracefully, with the usual polite formulations. It should be short and concise.

Ex:
A :  thks, bye :)
B : my pleasure, @++

References: 


To be done

* netiquette (IETF RFC 1855);
* multitasking considered harmful;
* something about RS232 or any actual low level HW protocol could be fun;
* maybe finding an old fashioned, totally outdated book with funny pictures and a pedantic title like «le guide de la politesse par l'amiral mes fesses» ("the guide to politeness by admiral my ass") would be funny;
* I really love salamalecs, so finding a good unbiased article by an anthropologist is a must;
* putting in a fake standardization committee reference, or creating one like HNETF (Human NOT an Engineer Task Force) with a motto such as «we care about all that is way above the applicative OSI layer», as a parody of/homage to the IETF could be fun;
* some SERIOUS hard data to back up my claims (x2 estimations, concentration spans, ...).

TODO 


* format that as a PEP or RFC;
* make RFC 00 for defining the RFC format/way of interacting, to make it evolve;
* specify it is a draft somewhere;
* find an IRC channel for discussing :)
* corrections (grammar/orthography);
* experiment, and share to get feedback, maybe it could actually work;
* don't overdo it;
* make a nice state/transition diagram;
* provide a full example with a timeline (copy/paste, prune, s/// of an actual conversation that worked this way);
* add a paragraph about multiculturalism and the danger of expecting people to have the same expectations as you.

EDIT: name it the salamalec protocol, I really love this idea.


DevOps are doomed to fail: you never scale NP problems

We live in a wonderful world: all new technologies have proven that old wisdom about avoiding NP problems was stupid.

The Travelling Salesman Problem? (the optimization version is NP-hard rather than in NP, I know)
Well, doesn't Google Maps give you wonderful optimized routes?

And K-SAT?
What is K-sat by the way?

The SAT problem is the first problem that was proven NP-complete (Cook's theorem; wp, youhou).

What does NP mean in computer science nowadays, translated into words devops/business people can understand?

It cannot scale by nature. 

Most devs reason as if we can always add CPU, bandwidth, memory to a computer.

The truth is the world is bounded. At least by one thing called money.

So here is what I am gonna do:
- first, try to help you understand the relationship between KSAT and dependency resolution;
- then, try to see roughly what the underlying hidden problems are;
- tell you how we have cheated so far;
- then show that the nature of the problem is predictably in contradiction with accepted current business practices.

Solving the problem of knowing which packages to install, and in which order, given their dependencies, is NP-complete.


There is a more correct way to explain this reduction; here is the intuition.

So: K-SAT is the generic problem of solving a boolean equation with k parameters per clause, where some parameters may in fact be expressions of other parameters (solutions of another equation; those are the cycles I will talk about later).

After all, booleans are the heart of a computer, it should be easy.
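Checking ONE assignment is indeed easy; the catch is the number of assignments. Here is a minimal brute-force satisfiability check, a sketch using my own clause encoding (a real solver is much smarter):

from itertools import product

def satisfiable(n_vars, clauses):
    # A clause is a list of signed variable indexes:
    # [1, -2] means (x1 or not x2). Try all 2**n assignments.
    for bits in product([False, True], repeat=n_vars):
        if all(any(bits[abs(v) - 1] == (v > 0) for v in clause)
               for clause in clauses):
            return True
    return False

# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
print(satisfiable(3, [[1, 2], [-1, 2], [-2, 3]]))  # True

The loop enumerates 2**n assignments: the worst case is exponential, whatever the language.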

It seems easy as long as the equation is a tree. And aren't all our languages based on the parsing of an AST? Don't they work? So we could write a language for it.


Well, no. Computers manipulate data. A register does not tell you what the data is. Is it one variable of my equation, its name, its value, its relation to other addresses...?

The hard stuff in computer science is to make sense of data: to make it become information by handling the context.

Installing a package on a computer is in fact building a huge graph (40k nodes on debian), and when a package is to be installed you begin by asking the first equation:
ready_to_install = union(dependencies == satisfied)
If false, then you go to the dependency solving stage.

For each dependency listed (to the nth order),
build a list of packages not yet installed that will be required.

Plan the installation of these packages with the actual solutions chosen (there may be more than one way to solve your dependency, so the equation has not one but potentially N solutions).

So... you have to evaluate them... recursively (because parameters are solutions of other equations)... then stack them... sometimes the solution is not good, so you backtrack to another solution, modify the stack... and so on.
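A minimal sketch of that stacking/backtracking, with a toy repository (package names and the data layout are made up; version constraints and conflicts, where the real pain lives, are omitted):

REPO = {
    "webapp": [["apache", "nginx"],  # one dependency, two alternative solutions
               ["python"]],
    "git": [["git-perl"]],
    "git-perl": [["git"]],           # a small cycle, as in the centos example below
    "apache": [], "nginx": [], "python": [],
}

def install_plan(pkg, installed, stack=()):
    if pkg in installed or pkg in stack:   # already handled, or cycle: group it
        return []
    plan = []
    for alternatives in REPO[pkg]:         # each dependency has N possible solutions
        for choice in alternatives:        # pick one...
            sub = install_plan(choice, installed | set(plan), stack + (pkg,))
            if sub is not None:
                plan += sub
                break                      # ...and keep it
        else:
            return None                    # no alternative worked: backtrack
    return plan + [pkg]

print(install_plan("webapp", set()))  # ['apache', 'python', 'webapp']
print(install_plan("git", set()))     # ['git-perl', 'git']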

And it's over?

Meh, not really.

What if package A is installed, package B requires A+1, and A & A+1 are mutually exclusive? (small cycle ex: on centos 6.4, git requires git-perl and git-perl requires git)
What if package B requires A, C, D; C requires E; E requires F and G; and G requires B? This is a circular dependency, a cyclic graph.
In which order do you install the packages so that after every step all software still works? (You don't have the right to wipe installed software to solve a version dependency.)

Where is the catch?

Variable substitution can make the boolean equation impossible.

ex: A & B = True; given A, what is the value of B? Easy: True.

The given equation is the desired state of the system: packages A and B should be installed.

A is True because package A is installed.

What if B = ~A ?

The equation is not solvable. A trivial case that normally doesn't happen.

What if B requires C and D, and D requires N, and N is exclusive of A?
(Example: A == apache, N == nginx, and the software B requires nginx on 0.0.0.0:80.)
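This tiny case can be checked by brute force, using the same clause encoding as the sketch above (my own notation, for illustration):

from itertools import product

# Variables: x1=A (apache), x2=B, x3=D, x4=N (nginx).
# Desired state: A and B. Requirements: B->D, D->N, N excludes A.
clauses = [[1], [2], [-2, 3], [-3, 4], [-4, -1]]
ok = any(all(any(bits[abs(v) - 1] == (v > 0) for v in clause)
             for clause in clauses)
         for bits in product([False, True], repeat=4))
print(ok)  # False: no assignment satisfies the desired state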

Testing for a cycle is easy given K determined vertices. Finding how to check all the possibilities, given all the N sets of I partial solutions, is quite a bit more complex.

This is known as DLL Hell!

This is called software requirements (see ITIL, which makes a lot of fuss about it).

We are already facing small problems, but nothing that really matters. We have not yet talked about how we cheat, and why some are heading for a disaster.

Why do computer engineers avoid NP problems by the way?


The universe is bounded.

The data structure needed to solve dependency resolution is a graph.

The nodes are the packages (variables).
The edges are the logical expressions of the software requirements (A.version > B.version).

So, before talking about algorithms, just one basic fact:
in the worst case, when you add a node to a graph with n nodes, you add up to n new edges.

Thus the total number of relations grows more than linearly (a complete graph has n(n-1)/2 edges).

You still have to store that information... in memory (for the work to be done fast).

Then you have to detect the cyclic references. The first-order ones are easy.

But not always. There are ambiguities in the edges: A requires B.version > 1.1
and C requires B.version < 2.2 may conflict if B is only available in versions 1.0 and 3.0... so... there is much more than what the eye can see :)

And cycles can be bigger than the usual classical case of 2 mutually exclusive packages.

But that is not all.

The normal algorithmic way to solve the equation is to create the graph and do the systematic evaluation of the cases.

The computing time grows, in the worst case, explosively.


But we are not in the worst case: with my «blabla» OS it takes me 3s with 4k packages to install, and 3s with 41k packages installed.


Well, we cheat.

One part of the cheat is not going for the exact solution: KSAT solvers are optimized using known properties of real-world packages.

We cheat even more by relying on human beings.

Maintainers in most distributions are doing an excellent job at testing, fixing the bugs that OS users report, and keeping dependencies very minimal.
We are in a special case where the edges are not very dense.

The algorithm seems to scale. But... it can't... since we are changing the domain of validity of the KSAT solvers we use: their optimizations rely on sparse connections//few requirements per software.

The DevOps problematic is not ONE computer. It is a set of computers with different operating systems, and in-house developers who ignore what packaging is all about.

So you don't have one set of equations to solve your dependencies, you have n sets. And now the requirements may link to other sets of equations:
example: my python program on server X requires nginx on the front end Y.

Oops, I don't have a graph of 40k nodes anymore, but 800k nodes now.
Do you want to compute the number of potential edges with me? No. It is huge:
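A back-of-the-envelope computation (the node counts are the ones quoted above):

# A graph with n nodes has n*(n-1)/2 potential edges.
for n in (40000, 800000):
    print(n, n * (n - 1) // 2)
# 40000  ->     799980000  (~8e8)
# 800000 -> 319999600000   (~3.2e11)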

My set of dependencies has grown a lot. My input data has grown more than linearly, and the CPU time needed to solve the new problem grows exponentially with it in the worst case.

If your apt-get install apache takes 3 seconds on your ubuntu, your chef deployment will take you 3 minutes.

And, in real life, there are still people installing software from source without using a package manager (as if that was not complex enough).

So your data is possibly not even accurate.

To sum up, we tend to:
- multiply the number of edges more than linearly;
- increase the number of nodes more than linearly;
and feed that to an algorithm that, in the worst case, takes exponentially more time given more input. And we tend to move towards the worst case.

The time and complexity are increasing very fast.

Why old wisdom matters!

«I tend to think the drawbacks of dynamic linking outweigh the advantages for many (most?) applications.» — John Carmack

The fashion on Android and OSX is to prefer statically built applications. It greatly diminishes the number of edges in the graph. It diminishes the software requirements... on the front end.

But smartphones and tablets are very much CPU/IO/battery bound, so we offload more and more computing to a distributed system called the cloud.

And let's zoom on the cloud system requirements.

Since we have outgrown the resources available on one computer, we are replacing the in-memory cache available to more than one thread with distributed in-memory caches (memcached, mongo, redis...). We are adding software requirements. We are scattering/caching/backing up data everywhere, at all levels.

Since we can't serve the application from one server anymore, we create cross dependencies to raise the SLA.

Ex: adding a dependency on HAproxy for web applications.

For the SLA.

So your standalone computer needs no 99.9% SLA: when it is shut down, it serves no one.

But now, since we don't know when you are gonna use it, or where you are, we have to increase the backend's SLA.

By the way, SLAs compound.

My CDN is 99.9%.
My heroku is 99.9%.
My ISP is 99.9%.
So my SLA is now... between 99.9% and 99.3%: availabilities in series multiply, and yep, you forgot to add the necessary links between your CDN, your heroku, and your customers...
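A minimal sketch of the compounding, assuming a chain of seven 99.9% components (the list is made up):

components = {"CDN": 0.999, "heroku": 0.999, "ISP": 0.999,
              "link1": 0.999, "link2": 0.999, "link3": 0.999, "link4": 0.999}
availability = 1.0
for name, sla in components.items():
    availability *= sla              # availabilities in series multiply
print(round(100 * availability, 2))  # ~99.3% with 7 components at 99.9% each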

You need a 99.9% SLA. It is cool, but it is your upper bound.

And you build a growing uncertainty for the worst case.

Or you could expect more SLA from your provider.

What is this SLA beast?


Service Level Agreement. The availability of a service over a given time on average.

99% SLA over one year ~= 3.65 days down.

Would you still use google/fb/twitter/whatever if it was down 4 days per year?

If a critical service of your business (like mail) is 1% off, you have 1% less gross income.

So ... our modern distributed technologies are aiming at 99.999%

Mathematically, the composed SLA is thus a decreasing function of the number of components.

And these components are de facto based on increased requirements. They rely on an algorithm that is NP-complete.

Mathematically, dependency resolution is an exponentially time-consuming function. And you are feeding it more than linearly growing input.

So ....

Mathematically they are bound to intersect.

Just for memory: chef recommends 30 min per run // the equivalent of an apt-get install that takes 3 to 45 seconds on your computer.

These are the corresponding downtimes:

Availability    per day       per month    per year
99.999%         00:00:00.9    00:00:26     00:05:15
99.99%          00:00:08      00:04:22     00:52:35
99.9%           00:01:26      00:43:49     08:45:56
99%             00:14:23      07:18:17     87:39:29

So, well... imagine a distributed deployment that went bad: what do you think happens to the SLA?
And do you trust people who say they never make any mistakes?

I don't say the days are near when this NP-complete aspect of software deployment will bite us.
I say these days exist.
I say the non-linear nature of the problem makes it impossible to predict when.
I say the phenomenon will be very abrupt, due to the nature of the phenomenon.
I say we are pushing towards choices that will create the problem.
I say business analysts, companies, CTOs will not see it coming.

And that is my last point:

Our scientific education makes us blind to non-linear problems

The first words of your scientific teachers, which you had forgotten before they taught you science/math, were: «most problems are not linear, but we only study the linear ones, because they are the only ones for which we can easily make accurate predictions».

If you have a linear system you can predict, plan... and make money. Without one, well, you are playing the lottery.


What is non-linear stuff?
- weather (weather forecasts beyond 24 hours are still a scam, even though our computers have been crunching ever more data for 40 years);
- actuarial science/finance: selling products based on the probability that a connected problem will happen;
- resource consumption (coal, oil, fish, cows);
- biodiversity;
- cryptography (you search for symmetrical operations with asymmetrical CPU costs);
- floating point behaviour ((a op b) op c == a op (b op c) is not always true; see the sketch after this list);
- economy;
- coupled moving systems in classical physics (randomness can be obtained easily from predictable systems if you couple them correctly);
- quantum mechanics (it bounds the max frequency of the CPU);
- the movements of the planets (you can send me your exact solution for where the moon will be relative to the sun in one year, with respect to a referential made of 3 distant stars);
- internet bandwidth when bought from a tier one;
- real life, sociology, politics, group dynamics...
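The floating point sketch promised above (plain python, exact values depend on IEEE 754 doubles):

a, b, c = 0.1, 1e20, -1e20
print((a + b) + c)       # 0.0 : the 0.1 is absorbed by 1e20
print(a + (b + c))       # 0.1 : b and c cancel out first
print(0.1 + 0.2 == 0.3)  # False: binary floats round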

You see a common point there?

We still have not solved these problems, and we do not learn how to solve them in our regular curriculum.

I don't say there are no solutions.
I say there is no solution. Yet. We will never find the solutions if we don't become aware of the problem.
Non-linear problems are not a computer problem. They are an intellectual problem that requires proper thinking.
It requires education.

We are pretending to live under the empire of necessity, but there is no necessity to accept this reign.

We try to build a new world with the wrong tools, because we make the false assumption that we can handle the problems we face with the methods we learned at school. We rely on the «giant's shoulders» to make good tools. But, since we are not well educated, we invest money in the wrong tools for our problem. Tools often made by the right guys for no actual problem.


Firstly, we should slow down our adoption of async/distributed systems.
Secondly, we should lower SLAs to reasonable levels: if a 2% service interruption on one of your production services can kill you, your business is not reliable.
Lastly, we should understand that the more efficient our systems are, the more fragile they become.
It may be time to trade efficiency for durability. It may be time to slow down and enjoy all the progress we have made.

The crushing of Big Data by the Dude: Ultimate Data!

(Homage to The Big Lebowski.)

I was enjoying leisure and was pretty happy with myself, but annoyed.

Annoyed by the nonsense around big data. Big data is not measured by how big your data is, but by how significant and simple the data that comes out is.

Big data should be the ultimate data.

Now, let's try to do what all good programmers do before coding.

Relax, close your eyes and imagine the better world for your ultimate customer.

Relax even more. Would I switch place with him?

Holy cow, yes!

But, I can't take his place. It sucks to be me. I want to be that guy. So I imagine I become the boss as a fraud. How would I keep my place?

I need something. Something like BIG data, so that I can easily, lazily make bucks. Big buck$. I need the most simply designed, evilish software to spare me the trouble of working.

That's how you think. And you are right.

What is it?

Something that shows the business is alive. Something I can read easily, pretending I am a wizard, while no one else understands.

Imagine a simple digital-clock-like device on the wall, showing one piece of info. What would you put on it?

The actual flux of income/outcome, without any artificial filters.

Imagine how thrilling it must be to have a direct "load" reading for your business.

Is there a bottleneck in your production? Your load will stagnate.
Is it day or night for your customers? You will see if it matters.
Is it the Christmas holidays? You will see if it matters on your load.

Is there a bug in a software that results in gains? You can give the coder a bonus to keep him making the real bucks.
Is there a feature that results in a direct gross loss? Well, even if it is correct, these people are gonna kill your business.

It would be damn fun and exciting. Just remember: when good news happens, you have to understand why.

You need a flux of raw information, maybe coming from other channels (newspapers for instance), and a timeline of your value, to make the correlation correctly.

You can actually read the results in real time. You could even apply deconvolution filters to erase trends and seasonal activity to improve the results.
And, like the load average on a computer, every time scale makes a difference. But in practice a skilled eye sees the patterns fast, so deconvolution may be overkill.
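A minimal sketch of such a filter, assuming daily revenue figures and a weekly cycle (the data is synthetic, for illustration):

def moving_average(xs, window):
    return [sum(xs[i:i + window]) / window
            for i in range(len(xs) - window + 1)]

# synthetic load: a linear trend plus weekend bumps
revenue = [100 + 2 * day + (15 if day % 7 in (5, 6) else 0)
           for day in range(28)]
trend = moving_average(revenue, 7)    # a 7-day window erases the weekly cycle
detrended = [r - t for r, t in zip(revenue[3:], trend)]  # align window centers
print([round(d) for d in detrended])  # the weekend bumps now stand out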

And now you too can be the big dude with big bucks, with only one skill: looking at a figure. A boss is just a guy who pays attention to always having a positive cash flow on at least one time scale he controls.

You have to respect laziness. So you relax more, 'cause you are not Rockefeller and you have high standards. When you are the boss, you want to go to Tahiti by cargo boat.

So you think of delegating. You are lazy, but smart.

So you have to be able to delegate the coding without being conned. Because, if you hire another yourself, you know it is easier to be well paid selling your boss IT crap than actually trying to provide real services.

Does it require bigger data?

Nope, it requires the ultimate big data: the direct result on the cash flow of every action, in real time, expressed as a figure. Or anything that correlates with it. One figure to rule them all: cash. We humans are good at out-of-the-box thinking. If I had a hairdresser franchise, I think I could correlate the water consumption in all the franchises with my activity, with a certain confidence. That might be my ultimate data in this situation: simple data correlated with my business. It won't be precise; I won't see the financial part.
But what about the finance if I lose my customers faster than my hair?
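A sketch of that correlation idea (the figures are invented for illustration):

water   = [120, 135, 150, 160, 142, 170, 180]         # liters/day, all salons
revenue = [1000, 1150, 1300, 1320, 1200, 1450, 1500]  # euros/day

def pearson(xs, ys):
    # Pearson correlation coefficient, computed by hand
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

print(round(pearson(water, revenue), 2))  # close to 1.0: a usable proxy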

It just requires smarter ways of gathering data. Growing the size of the data in an uncontrolled way may not be the solution.

Smarter data may involve a little bit of an archaeological domain called "empirical science".

It involves studying our math in order to sample a smaller set of data and still be able to give results with their errors.

Adding errors to the ultimate data is also ultimate.

It enables a controlled tradeoff of cost against exactitude. Precision is about the quantity of information; exactitude is about the quality.
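For instance, a sketch of reporting a figure WITH its error, using a small sample and the normal approximation (the numbers are synthetic):

from random import gauss, seed

seed(42)
sample = [gauss(50, 12) for _ in range(100)]  # a small, cheap sample
n = len(sample)
mean = sum(sample) / n
sd = (sum((x - mean) ** 2 for x in sample) / (n - 1)) ** 0.5
se = sd / n ** 0.5                            # standard error of the mean
print("%.1f +/- %.1f (95%% CI)" % (mean, 1.96 * se))  # the value AND its error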

Big data is precision.

But ultimate data is cost effective exact data handling with their level of confidence.

Not nice figures, unchecked but labeled «exact». Less information, but more significant, with a reliable level of confidence: data you can trust, that has now leveled up to the rank of information! You want one small piece of information in real time: are my decisions good or bad? In fact, the trouble is that this figure is a glasshouse: everybody can also see the impact of your wrong decisions. A good tool should be dangerous, else there is no fun.

As soon as your IT teams improve, your error margins will improve (on the condition that you cut some heads every time a value lands outside the previously announced confidence interval, among other precautions).

If you are really paranoid add the IT related cost per customer on a side channel.

Nowadays, data grows exponentially with the size of your graph. And, because of the added dynamics, it also grows more than linearly over the lifespan. And the more people use messaging-based systems for more operations, the more growth you add.

If we follow the reasoning, then big data is just overwhelmingly bigger data: companies that buy into it are doomed. Data consumes electricity, to say the least. The OPEX grows more than linearly with customers & providers. The bigger you become, the more vulnerable you are, until you reach a lock-in situation.

Your interest is to control your cost per customer. You want to diminish your cost per customer while your base of customers grows, not the opposite. If I were your competitor, I would not contradict you but rather encourage you.

Anyway, the ultimate big data: I will build it with my friends, for us to become the big dudes, relaxed, with time to spare. It can be done. You just have to relax and focus on the essence of the data, not on its accidental nature. The essence lies in simplicity, clarity and causality, in accord with your goal.

Ultimate data is the simplest tool for measuring in real time your success and failures according to what matters the most. No more, no less.

How FSF (and free software zealots) miss the point of Free Software

My physics teacher used to say:
why is a religious question. The real important question is how, since once you understand the mechanism of something you can improve it, while wondering about finality doesn't bring any useful answers that can be checked or used to influence the system we observe.

This is my first attack on FSF: while I love the definition of free software given by FSF, I am very reluctant to adopt the ideological views of FSF.

Just to refresh memories, the definition of free software by the FSF is made of 4 freedoms that should be granted by the licence: non-discriminative usage, sharing, studying, and modifying the code, which should be provided when the software is released. It results in one important property of this software: it can be forked.

This definition is accepted by everyone and is the reference (even for the OSI), and some weird spirits say that the FSF licence is a tinge more limiting on the freedom of usage than other licences such as BSD.

The main distinction between Open Source and FSF (gnu project) is the finality.

Open source is considered a pragmatic or materialistic approach, where software is viewed as an economic externality for which the cost of development is so high, given the available resources, that sharing is just a means to make the cost decrease while pushing back the «frontier» of new problems to solve.
The FSF thinks software is about freedom. The bits of code we share, according to them, have much more power: they are the foundations of a new free society that empowers citizens. They are the partisans of an «intelligent design» of software.

For them, we face a competition from bad proprietary software trying to enclose us in a technological lock-in due to the very nature of an economy based on externalities; according to the Chicago school, the software/OS industry should tend towards natural monopolies: the more you use my goods, the more my costs diminish, thus the more money I make, even if my product is crap, as long as it is adopted.
The FSF also thinks citizens can be free if they have tools to express themselves, and sees computer networks as the climax of the modern Gutenberg press.

To sum up: free software is, for the FSF, a synonym for free speech and a free society.

So, in order to avoid the lock-in, religious zealots focus on trying to provide alternatives to the «potential lock-in technologies» needed to build an independent functioning OS:
- kernel (Gnu HURD ;) );
- system (GNU bash, openSIP, GNU...);
- development (GCC, Glibc, Gnu ADA, mono, Guava, Gnu ...); 
- security (GNUTls, GPG...);
- desktop (Gnome..);
- office suite (Gnumeric :) );
....
It is accurate to say that, in order to be functional, current Linux distributions use a lot of GNU technologies brought up by the FSF. Hence the FSF's claim that one should not say Linux but GNU/Linux (the pronunciation is available in .au somewhere on the fsf sites and is worth a good laugh).

While we live under the empire of necessity (externalities), it is wrong to praise the necessity; there is no necessity to live under any empire, may it come from the forces of «right». We should totally consider getting rid of most FSF-sponsored software when it is harmful.

And that is my point of opposition with the FSF. I kind of agree with their views; I strongly disagree with their way of trying to achieve their goals, which to me is counterproductive in terms of engineering and of education.

Hence the how vs the why approach.
   
The FSF codes alternatives to lock-in technologies without wondering whether:
  1. they are competent in the field;
  2. these technologies were beneficial in the first place.
See «GnuTLS considered harmful», a rant about how clueless the GnuTLS team is about C coding and maintaining code. Security is kind of like walking in a very dense minefield: when the code behaves like a drunken man from the most basic coding point of view, you don't trust the code.

FSF zealots will say: not a problem, by the sheer property of openness and amelioration, the code will tend towards better code.

Well, this is wishful thinking: crappy engineering, even with good QA, very rarely turns into good engineering in the end (bf110, fulmar...).

And kaboom, you step on a mine: a bug in GnuTLS made it possible, for 5 years, to bypass certificate checking. The NSA really must fear the FSF's claim that being convinced of freeing society makes software that rocks.

And still the FSF spreads FUD (fear, uncertainty, doubt) about the dangers of proprietary software while promoting «safer» free software... Please!

Is GnuTLS an isolated case? Well, Gnome (notably for using C# everywhere), gnumeric, gcc, glibc, mono... have received a lot of criticism for their engineering, and there are more. As Linus Torvalds says: all bugs can become security bugs, so the first rule of security is to adopt correct software.


Plus, some alternatives are even worse than no alternative at all.

Office suites bring to computers all the confusion of mixing what you mean with what you see, and the stupid "paper" analogy for documents is harmful. People should focus on the content. Software is gifted at applying a lot of stupid rules: versioning, applying templates, typographic rules, hyperlinks, access control in a distributed environment. And we still use these bloatwares called office suites, which make you write on a virtual piece of dumb paper.

Our computer-designed paper documents are, when you know the rules of typography, below what we can do with manual typesetting. Typographic rules are not for grumpy old men; they are a way to raise the speed of reading and the comprehension of written documents. Yet, with these awesome computers, we produce documents that are pathetically less readable than what we could do.

The FSF is by definition reactionary and conservative, «proposing alternatives» to already adopted lock-in technologies.

Their ideological blindness makes them back wrong, so-called «new technologies» like sheep. Maybe TLS is wrong. Maybe C is wrong, maybe traditions are wrong. Maybe they are right. But I am sure that blindly reacting to «proprietary lock-in» by quasi-systematically proposing a free software alternative is dumb. It sometimes helps the adoption of incorrect technologies.

Thus and here is my conclusion: FSF is harmful.

Against all evidence, free software zealots and fanatics are using the post-Snowden era as a way to advocate GNU/free software. Saying that because the people who code it are «good people», validated by a Political Kommissar inquiring into their views, they make «good software».

Well, fuck no.

openssl (which is not GNU) and gnutls are below-average crypto suites that have received very harsh engineering criticism since the Snowden revelations.

I don't mean the proprietary alternatives inspire me much more confidence (like RSA's stuff).

I say we don't want «good» or «free» software. We want software that is well built, and thus that we can trust.

The FSF says open source, by enabling the «proprietarization» of software, is evil.

Well, before Windows adopted the BSD stack, its TCP/IP stack was much more vulnerable to sequence prediction attacks. They may have changed their stack since Windows NT.
But what I can say is that it is better for the TCP/IP communication between two computers to be safe. TCP/IP doesn't care about Linux or Windows or BSD, and one compromised computer makes a lot of people insecure.

So, once more, I prefer Windows to use well-engineered open source software, because it also, selfishly, helps keep me safer.

And last and least: what matters in a software is not what it is said to do, or any phantasmagorical values, but its correctness. The FSF is just doing marketing for its own chapel, a chapel to which I don't belong; it speaks in my name, and I strongly oppose that.

I am a dysfunctional small part of free software, yes. I could even be rated a failure: I can totally live with that. But, still, I am part of it.

The FSF should bear in mind that it doesn't own free software or its values. Even a developer as insignificant as I am holds opinions divergent from theirs, like a lot of devs who code instead of writing stupid blog posts as much as I do.

Magnificently defining the four freedoms of software does not give the FSF the right to pretend to speak for, or express the views of, the free software communities.

Collaborating on something does not imply we share a common view. And that is the freedom zero of free software they forgot: the radically non-discriminative freedom to use free software whatever your opinions are.

We don't need a unified free software movement. We don't need political strength. We don't need wider adoption of «free software» in security, for instance; we need wider adoption of «correct» approaches to security, which will not happen if the FSF pushes the adoption of poor technologies by providing even more broken alternatives to proprietary approaches (sometimes based on the open standards they cherish, like oauth2.0).

We don't need an alternative to the desktop «à la windows or apple», we need a correct desktop.

Finally, we don't need more «adopters» who want free software everywhere; we just need more educated people who can make enlightened choices, based not on fear but on understanding. And it would be great if they helped us find new disruptive approaches, based on really new technologies, to solve the old legacy problems left by crappy software, some of it coming from GNU software.

And I am bored by their lack of culture in computer history: the first community aiming for users to be able to grow their own «vendor independent solution» was not born in 1984 with Stallman, but with the SHARE user group in 1955.

Please, you don't get credit for rewriting history. You just look like an Orwellian dystopian movement, or acculturated religious people trapped in their closed mindset.