-
Suppose you have a cluster
And suppose you have many services on this cluster… you have n! problems.
Each service is likely to interrelate with other services in some kind of dependency tracking.
And you just broke a key service.
How do you solve this problem? How do I solve this problem? I’ve been thinking about this on about three separate levels lately.
Implementation - do you have some sort of heartbeat service? Are your failures discrete? Continuous? What do you do on failure? Do you have a database of running services and their pids? How are you polling for liveness?
Software/cluster - this is a very traditional problem in computer science . There is a lot of literature available on distributed systems. There seems to be a loose consensus in the solutions pragmatically available online: master/slave (perhaps, as my old advisor posed, this should be captain/sargeant?), master/master, and some fashion of failover, and gross redundancy via some kind of load balancer.
Abstract. Distributed failure exists across a multitude of areas: crystals, transportation networks, power grids, software cluster, neural damage, etc, etc. This seems to be an interesting problem. What do other - non-artifical - systems do to manage system level change/damage? I’d like to draw from those models to really bring a novel twist. Anyway.
Let me formalize the problem as I understand it today, basing this formal description from the language and capabilities of software.
Let there be a directed graph G of vertex/node (V) and edges (E).
Each vertex is associated with a failure model.
Each edge is a triple (src, dest, failure_model). I’m not sure we need a failure model associated with an edge - possibly if an edge can fail, it is better modeled by an intermediate vertex lying in between src and dest.
This can be easily seen to have two forms to view by (maybe more). Form one is a steady-state model giving a probablilty of failure: the network is considered to have a total probability of failure. Form two is a simulation view, where on each simulation step, the failure model for each node is executed and failures propagated along edges. This has a tremendous similarity to the McCulloch-Pitts model of neural networks, although I do not presuppose training the weights on the edges.
Let’s add a new spin to this: the supervisory node. The supervisory node can ‘unfail’ other ( possibly supervisory) nodes, as well as add nodes. This of course models some kind of monitor spinning up a new VM instance or restarting the process. To add a node, we ought to indicate some kind of VM server M - M is not a ‘regular’ node (or should it be? just one with an unusual service? Hmmmm).
For the full complexity, supervisors can communicate with each other and take decisions. It is practically possible/probable to have a “control plane” communication network to handle these sorts of communications. For this post, let’s just assume that we can add an edge to the graph that models the control plane.
More formally, we have a tuple:
(V, {Nodes, Supervisors}, M)
One idea that seems to be a useful framework is the Viable System Model. Suppose we make the crass and hard assumption that our cluster is a dynamic system. Then we have to introduce the systems required to maintain some kind homeostasis.
Taking the VSM, we have 5 systems to consider:
- Tactical - basic tasks
- Communications
- Policy/Process/Procedure
- External monitoring feeding into other areas
- Overall steering/guidance.
The next post in this ‘series’ will be my thoughts on how to put together some kind of VSM model on how to configure a software cluster.
Follow-up thoughts anyone?
-
Rusty code
C + ML had a baby, and its name is Mozilla Rust. I am honestly really impressed. Yeah, I’ve done some Haskell, some O’Caml, nothing big. But the thing about those languages is that you’re really wearing the hair shirt. Haskell improved my thinking about code - I think about code in a much more monadic way now. But it was so hard to actually write Haskell efficiently. As a practicing programmer, that matters a lot to me for a production language.
Key takeaways so far.
Rust mixes up mutability and immutability in a language-defined fashion. This means that you can visually tag objects as ‘can change’ and ‘can’t’. You don’t have to rely on self-discipline and remembering what you intended months later.
Generics. Coming from Python (what I use at the day job), actually having well-typed generics feels like a breath of fresh clean air. It’s like the difference between fumbling around without a flashlight and turning the overhead light on. This integrates tightly with the trait system.
Optional non-GC. I think that this is one of the really heavy duty cool things Rust has going for it. Deterministic software execution is a really big deal in a number of contexts, and not many modern languages/runtimes actually shoot for that.
Powerful typing. Oh man. Using typing in Java, C#, and C++ can be so painful sometimes. If you’ve explored other languages, you know what I mean. Rust improves on that so much. And the error messages are really sweet. Arrows pointing at errors are a great invention! (Was it Clang that invented that idea? Not sure..)
Of course, there are a variety of cons I’ve observed.
Con #1 is this is a research language, under development. So any code you write today might not compile in the next release. Bad juju for enterprises. If, however, I was burning the midnight oil on a research project and could afford to throw out code… I’d be thinking hard about Rust.
Con #2 is the three pointer types. I forsee that this will be one of the key newbie confusion points. The difference between &, @, and ~. :-)
I also don’t see any way to implement subtyping/inheritence. I’m not personally familiar with the idioms needed to work around that lack, so I would be concerned about writing OOish code in Rust. One of the things I’ve been using to good effect in the Day Job is inheritence to add new features without mucking about in the internals of an existing class; losing that capability would concern me. I’ll have to figure out how to model that ability somehow.
I look forward to exploring pure imperative programming in Rust in the future.
-
Common Lisp tutorial
Along with Yogthos, I’ve started work on a Lisp tutorial site for practising programmers. It’s intended to cover how to actually do modern development in Common Lisp. Things like IDEs, deploying, packaging, etc. Things that industrial programmers have to care about.
It’s going to have code examples - some open source license - and a decent amount of writing. It’s not about CL the Language - those resources are covered as it is. It’s about how to get systems deployed for real work… or at least long-lasting work!
Here’s the github: https://github.com/pnathan/articulate-common-lisp - go ahead and fork it! Contribute back!
This will be a real site as soon as we can get the first articles going.
-
Software law problems
The computer world has a problem with old law. It doesn’t work well. Pretty much, the old laws cover things like:
- theft, where goods are zero sum
- costly duplication
- wiretap lawsElectronic systems allow copying without a zero sum and with minimal cost. Information is wrapped and diced and sliced when in transit in many different layers. Stark contrast to the telephone with only 1 signal.
Lots of software elements have no analog in the physical. But we have to manage it in our laws and practice. And people don’t understand software- its a matter of specialty and education. But it touches all of our lives so much….
So there’s a big problem with software law. The problem of grasping both software and law is hard and takes a huge depth of understanding in both. -
Rework of Connect4, part 2
All the coding I’ve done this week’s vacation is Java 6, reworking my 7 year old code. Hoo boy.
It’s not very pretty. The code itself is relatively clean - well commented, reasonable variable names most of the time. Unfortunately, the architecture decisions I made were lousy. Total conflation of the UI and the data. I’m not sure why I did that. I knew better at the time. I suspect what had happened is Spring simply overwhelmed me, and I didn’t pay attention to the actual logic and architecture. Loosely what I had been doing was using the color of a given blank area (the tile) to actually be the title for the AI analysis.
Not only that, but the code on my MBP system simply doesn’t work. What I believe happened is that the code was written on a single-core single-processor system (an old Macbook G4), and when it was executed on my i7, multithreading bugs became apparent. I spawned a GUI thread, then when the GUI had finished loading, then I (via polling), started taking turns.
All that has been cleaned up. The Mercurial repository will be up on Bitbucket sometime, and I’ll point out some key problems in a later blog post, for the edification of others.
Coming back to Java after 7 years of experience in Haskell, Python, Lisp, Perl, and more C++ is a very interesting experience. This was my biggest Java project, and it was a very bad experience initially, as per my prior post.
What’s Java like (after a few days of work), as an experienced software engineer comfortable in more functional/less ceremonial languages? Well, both good and bad. It mostly reminds me of a type-safe Python without first-order functions. It’s remarkably procedural out of the box. Everything is very clear and KISS. The lack of first order functions means that my good friends MAP, REDUCE, and FILTER simply don’t exist. Hence, boundary errors become present. Further, it’s hard to compose functionality without some sort of factory idea, which is a shame. Java still seems verbose, after using Common Lisp. The ability to abstract simply feels limited.
Most of the code editing work has been done in Emacs, with occasional flips over to Eclipse to try and figure out what the big magic deal is with Java IDEs. So far, the debugger is unspeakably confusing to figure out. I can not figure out how to trap an exception and view the values on the stack… something that I figured in under a minute with Clozure CCL on the command line. Of course, my customized Emacs beats an uncustomized Eclipse handily for straight text editing. What I am keenly interested in, actually, is the refactoring support. I had it strongly recommended to me by a retired SW engineer now turned professor. So far, I’ve not been impressed… what I look to do isn’t supported. Time will tell…
-
Bad old school code
Seven years ago I was taking Dr. Soule’s AI class at the University of Idaho. One of the projects was a connect4 game.
I decided I was going to do a GUI application for this, and it would rock. In 2005, what would sound like a good idea was Java with Swing. It was… excruciatingly hard (at that time). Hours and hours work went into it, and I considered it a major triumph of work. Unfortunately, the UI isn’t…. totally bad, but instead of creating artificial intelligence, I created artificial stupidity. It seemed like a major piece of work, however. Lots of lines of code.
I’ve been thinking about different languages for a while now, and I’ve been wanting to probe Java/JVM work. Since this old code was such a major project, it seemed fitting to review and examine.
It was 1339 lines. Boy howdy, has my perspective changed!
I figure I will improve the AI and then rework the system to use Clojure and then ABCL.
-
Partition by count
One algorithm I frequently need is to split up a list into a set of sublists of size n. E.g., a resource that can only take up to n requests in a batch.
Here is a Common Lisp version of it, using the AIF macro from ANAPHORA.
(defun partition-by-count (sequence n) "Split sequence up into a number of sequences, each n long. If the length of sequence is not a multiple of n, the final result is short. Returns a list of the sublists created." (when (< n 1) (error "Passed in ~a; n must be greater than 0" n)) (let* ((seq-length (length sequence)) (leftovers (mod seq-length n))) (let ((main-body (loop for i from 0 below (- seq-length leftovers) by n collect (subseq sequence i (+ i n))))) (aif (subseq sequence (- seq-length leftovers) seq-length) (append main-body it) main-body))))I was showing a friend Common Lisp code the other day, and he remarked that it’s a very dense language. Looking at this function, I have to agree - there is no syntactic elements that can easily be removed. The meaning is there, to be grasped (or not). In a way I suppose that this relates over to Yegge’s Portrait of a N00b. Minimum ceremony, maximum semantic content. No comments; the docstring informally describes the contract.
-
cl-yahoo-finance at v3.2
Working on the protocol code for the more general querying API, I ran into a bug in cl-yahoo-finance.
Read-historical-splits was broken and I hadn’t noticed (!!!). I fixed it and sent it up to github.
-
Protocols and libraries
I am thinking about building a protocol for finance libraries; this protocol needs to wrap some other libraries and provide a consistent interface for the common results from these libraries. Ideally, each library doesn’t know about this hypothetical protocol; they can be developed independently.
Suppose we have cl-finance-queries, a generic protocol.
cl-finance-queries provide a method to gain information from an arbitrary endpoint library. The library doesn’t really know anything about cl-finance-queries. The last requirement really messes things up; it requires adaptor code to be created to transform the endpoint code into a final normalized form.
Suppose I require that a finance library supply a number of functions with certain names: cl-finance-queries digs around the namespace and all functions meeting the name are used. This lays an obligation on the namespace library and would almost certainly require creating adaptor code on the library’s side.
Suppose that I supply a number of defgenerics and require that a finance library supply a defclass with defmethods that meet the defgenerics. This almost requires an idea of a Connector object who doesn’t really store data, but only truly serves as a mechanism for type specialization on generic functions.
Still thinking about this. The specification for an endpoint library to not know about the protocol is probably sane.
My ideal code looks like this:
(ql:quickload :cl-yahoo-finance) (ql:quickload :cl-finance-queries) (cl-finance-queries:current-stock-data :cl-yahoo-finance "IBM").Note that this imposes no knowledge from cl-yahoo-finance of what cl-finance-queries is or does. I could as easily swap in some
:cl-google-financeand theoretically, magic would happen and the same answers would come back.However, perhaps Google requires a login token, so really what it will be is:
(let ((session-token (make-instance 'cl-google-finance :usename user :password password))) (cl-finance-queries:current-stock-data session-token "IBM"))This seems very reasonable as well: why should the protocol expect only a namespace, and not an object denoting a session?
Maybe configuring an adaptor class is the proper way to go about it, using this modality.:
- cl-finance-queries has a session class
- cl-finance-queries has a list of generic functions
- cl-yahoo-finance gets an extension cl-yahoo-finance-queries which implements the generic functions and defines a session class which inherits from the cl-finance-queries class.
If you feel like commenting on this, I’m on Twitter: @p_nathan.
-
Tests for uniqueness in CL sequences
One of the things I keep running into is needing a UNIQUEIZE or UNIQUE-P function in code.
Here are two different functions that implement the UNIQUE-P predicate. They perform loosely as one might guess. The hash table version trades memory for cpu cycles and performs faster, the list version trades the other way and takes less memory.
The hash-table implementation has an issue where it requires the keys to be hashable. :-/
I’d like to optimize the list version to be less than the square of n, but I’m not sure how to do that without moving into some sort of hashing-based thought process.
Happy Hacking
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ;; Tests for uniqueness. In my tests, the list-based function is dramatically ;; slower (20x) but takes 10x less memory on the worst case: 100% ;; non-unique. (defun unique-p-hash-table (list &key (test #'eql)) "Returns t if list is unique or nil otherwise" ;;This is O(kn). (let ((temp-table (make-hash-table :test test))) (loop for ele in list do (setf (gethash ele temp-table) t)) ;map it to t just to map it to something (= (length list) (hash-table-count temp-table)))) (defun unique-p-list (list &key (test #'eql)) ;; This is O(n^2) (let ((seen-already)) (loop for ele in list do (setf seen-already (adjoin ele seen-already :test test))) (= (length list) (length seen-already)))) (setf *unique* (loop for i from 0 upto 100000 collect i)) (setf *random-non-unique* (loop for i from 0 upto 100000 collect (random 10000))) (setf *non-unique* (loop for i from 0 upto 100000 collect 10)) -
cl-yahoo-finance at v3.0
cl-yahoo-finance has been upgraded to v3.0.
It now includes proxy capabilities and a function to get company information.
-
Lisp webapp work
Webapps for Common Lisp just got considerably easier this week with Heroku getting a Lisp integration setup.
Take a look at the sample Lisp web app.
I’ve spent a few hours working with it, and it works pretty good. Of course, I had to figure out a Lisp presentation engine for HTML, and I reviewed (and discarded) cl-who and cl-markup. They have a slick way of going about it, but due to macro rules (expand at compile time), I was having issues ensuring that content was being shunted into the right spot at run-time. This led me to Vana-templating, which did exactly what I wanted. Unfortunately, Vana-templating isn’t in Quicklisp (yet?), and so I had to kludge around to make it work.
I’ve been fussing around and have wound up submitting some small pull requests (Which were accepted, woo) into these projects to get things just that much better.
I did get it working though, so that was awesome. I look forward to seeing more Lisp webapps online.
-
cl-yahoo-finance getting a release soon
cl-yahoo-financeis going to be getting a release soon; I’ve got some work on the dev branch waiting, along with a pull request from riktam (thanks riktam!).This release will be adding proxy capabilities (unfortunately this changes the API slightly in breaking ways), and a
read-current-company-infofunction, which reads a new table in Yahoo.Looking forward for
cl-yahoo-finance, I don’t think that it has much more to do to be “complete”; what’s left seems to be cleanup of the results. After that, I anticipate that it will be in a steady state until Yahoo changes their API.At the next level of abstraction, it might be interesting to use
cl-yahoo-financeas a member of acl-company-infoclass, which adds in a number of generic accessors to compute information regarding the company. Also, a protocol could be developed so thatcl-google-finance(Or othercl-$company-finance) could be created and plugged in with the only difference being the make-instance call. -
Genetic programming
Working away on a programming project I’ve had off and on for a few years. It uses the principles of genetic algorithms to come up with a solution for an approximation problem. Unfortunately, I’ve completely forgotten how the code works. For some reason, I thought globals were a good idea, as well as not documenting the entry point.
What it will do, when I get it working again, is, given a relatively arbitrary test vector, come up with an approximation function for that vector, along with an error estimate. It uses Lisp’s symbolic capabilities to form up functions and evaluate them on the fly.
It’s pretty cool stuff. I think I’ll work it over to a point where it can read a test vector from a CSV and generate an approximation, document the code, add a readme, and call it good for now.
-
Nifty special variable trick
This allows overriding special (dynamic) variables conditionally. I picked it up from cl-csv. It will create a local binding of the dynamic variable.
;;; Dynamic variable overriding. (defvar *ursine* 'pooh) (defun return-ursine () *ursine*) (defun bearable (&key ((:bear *ursine*) *ursine*)) (return-ursine))