Is Smalltalk a Functional language? Is Lisp Object-Oriented?

Alan Kay says there are only two object-oriented languages that he knows of: Smalltalk and Lisp. The deeper I go into the history of Smalltalk, the more functional Smalltalk looks.

Transcript

Eric Normand: Hi. My name is Eric Normand. These are my thoughts on functional programming. I have kind of a weird topic I want to talk about today. Has to do with Smalltalk and functional programming. The more I read about Smalltalk and its history, what early programs in Smalltalk looked like, the more it strikes me as a very functional language.

The abstractions, the little data structures that people created, are defined in a very recursive way. In the paper “The Early History of Smalltalk,” there is a linked list that’s created from a little object that has a value and then a pointer to the rest of the list.

The length method is defined by adding one to the length of the rest of the list. All the methods are short like that. It’s just some very simple, concise code that defines this whole linked list. The whole interface of the linked list, including the implementation, fits on a note card.
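To give a flavor of that style, here’s a minimal sketch in Clojure (my reconstruction, not Kay’s actual Smalltalk): a node holds a value and a pointer to the rest, and length is one line of recursion.

;; A sketch of the recursive linked-list style described above:
;; a node holds a value and the rest of the list.
(defrecord Node [value rest])

(defn node-length
  "Zero for the empty list, else one plus the length of the rest."
  [node]
  (if (nil? node)
    0
    (inc (node-length (:rest node)))))

;; (node-length (->Node 1 (->Node 2 nil))) ;=> 2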

There are a number of other things Alan Kay has said that have made me think that what we now call object-oriented programming has missed the point.

What Alan Kay tries to emphasize is the stuff that is much more like what I’m calling functional programming, which is kind of ironic. Alan Kay has said that there are only two object-oriented languages in the world — Smalltalk and Lisp.

Lisp is not traditionally thought of as an object-oriented programming language. Does that mean that our definition of object-oriented programming is just different from his, the creator’s?

Obviously, it is different, but is it so different that he would classify Lisp as object-oriented when almost no one else would? It says something that he has a lot of respect for Lisp as an idea, that he was inspired by Lisp to create Smalltalk.

I think that object-oriented programming and functional programming have a lot more in common than we like to think. Consider the modern practice of organizing these big classes with lots of mutable state, and all that stuff you would see in a Java program.

I think that is not really what I’m going to call true object-oriented programming is all about. Creating these little abstractions that have little interfaces with nice recursive definitions — which is much more like functional programming — that is what the original definition of object-oriented programming was supposed to be.

I’m going to go ahead and say it: Smalltalk is a functional language. If Lisp can be object-oriented, then Smalltalk can be functional. That’s my short idea of the moment. I do think that the good practices in both of them converge.

In functional programming, we think you need to have short functions. In object-oriented programming, you need to have small objects with short methods.

They all kind of converge onto the same principles. There is just good programming at the end of the day. Now, what’s different is the terminology, the ideas, the practices, the approaches to problem solving. But at the end of the day, good code is good code.

If you’re going to make a small, recursive definition of something, that should work in both paradigms.

All right. My name is Eric Normand. This has been a thought on functional programming. A weird one about Smalltalk, functional programming and object-oriented programming. Thank you very much.

I’d love to hear your thoughts on this. If you go to Twitter, I’m @ericnormand. You can also email me, eric@lispcast.com. That’s my email address. Love to hear from my listeners. Please let me know, if you disagree, if you agree, anything.

If you like it, if you don’t like it, let me know. All right. See you then. Bye.


Default data reader function and tagged literals

Clojure has always avoided including reader macros, with the goal of making all code readable by all users of Clojure without requiring optional code. It’s a small constraint, but on balance I think it has been a good one.

However, Clojure 1.4 took one small step by allowing users to define reader literals: custom data structures built as compositions of other readable data, marked by a tag. For example, there are a couple of built-in reader literals, #uuid and #inst.

Clojure 1.5 made some additional enhancements to add more support for a default data reader function in the dynamic var *default-data-reader-fn*. This is a function that will be invoked if no data reader function is installed for a tag, sort of a “tag missing” function. By default, it’s not set and most people don’t use it.

However, it’s pretty useful to have a system that can generically handle (and pass on) any reader literal it happens to encounter. While you may not know what to do with it, perhaps the next process down the line does.

Clojure 1.7 added support for a generic TaggedLiteral data type, embodied in the functions tagged-literal and tagged-literal? (it’s really just a simple deftype). As it turns out, the signature of tagged-literal is exactly the same as the function expected for the *default-data-reader-fn* (it takes a tag and form and returns an object).

To get this ability to read, print, and pass along any tagged literal (even if not registered), you just need to connect these two pieces:

(set! *default-data-reader-fn* tagged-literal)

The object produced in this case supports the ILookup interface for keyword lookup with the keys :tag and :form, so you can interrogate the unknown tag as well. TaggedLiteral has built-in print support.
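For instance, a minimal sketch at the REPL (the point tag here is hypothetical; tagged-literal and the :tag/:form lookups are the standard functions described above):

user=> (def t (tagged-literal 'point {:x 1 :y 2}))
#'user/t
user=> [(:tag t) (:form t)]
[point {:x 1, :y 2}]
user=> (pr-str t)  ;; built-in print support round-trips the tag and form
"#point {:x 1, :y 2}"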

Here’s a bit longer example:

;; Clojure can't read tagged literals without a registered reader:
user=> #object[clojure.lang.Namespace 0x23bff419 "user"]
RuntimeException No reader function for tag object  
clojure.lang.LispReader$CtorReader.readTagged (LispReader.java:1430)

;; Set tagged-literal to be the default tagged value reader:
user=> (set! *default-data-reader-fn* tagged-literal)

;; Try again
user=> #object[clojure.lang.Namespace 0x23bff419 "user"]
#object [clojure.lang.Namespace 599782425 "user"]

;; Now it works, and reads to a TaggedLiteral object, which
;; supports ILookup on :tag and :form keys
user=> [(:tag *1) (:form *1)]
[object [clojure.lang.Namespace 599782425 "user"]]

You might ask, wouldn’t this be a useful default default data reader function? Indeed it might….


PurelyFunctional.tv Newsletter 279: Clojure web frameworks

Issue 279 – June 18, 2018 · Archives · Subscribe

Fellow Clojurists, I salute you!

I’ve been doing some research while building a guide to the Clojure web ecosystem.

There has been a lot of development in this area over the years. The web is an important platform. Clojure ideology maintains that we should build solutions from small, composable parts. I believe this is still the best choice in general. You build your application on top of Ring (or something Ring-compatible).

Even the most popular framework is built on Ring. So it makes sense to learn how to build an application yourself, even if you’re using a framework. My course Web Development in Clojure teaches Ring from the inside out. By the end of the course, you’ve learned to put together an application using handlers and middleware.

There’s a sale going on now. You can get 25% off all courses. Just use the code rainy_day at checkout.

There are some frameworks that, for their particular use cases, give the ideology a run for its money. Some of those are listed below.

I can always miss something, so let me know if I’ve forgotten anything. And I’d love to hear your opinions. Just hit reply.

Rock on! Eric Normand <eric@purelyfunctional.tv>

PS Want to get this in your email? Subscribe!


Brand new

Datomic Ions

Datomic Cloud now has a web story. You can run Clojure code, including Ring handlers, right in the JVMs of your Datomic Cloud cluster. It’s not really a framework, but it opens up a lot of options for those running Datomic Cloud.

Rich Hickey talks about it on the Cognicast.


Frontend

Re-frame

Re-frame is a frontend application framework written in ClojureScript, built on top of Reagent (a ClojureScript interface to React). It has no explicit backend story, but that falls neatly in line with the Clojure ideology of composing pieces together.

Re-frame is my recommended frontend framework. I’ve got two courses on Re-frame.

Understanding Re-frame is a high-level overview of the framework. It teaches you all the parts, but more importantly, the thought process behind building applications within the framework.

If Understanding Re-frame is a high-level overview, Building Re-frame Components is Re-frame in the small. Instead of a high-level overview, this course gets into the nitty-gritty of building interactive components and workflows.

All courses, including these two, are now on sale for 25% off the normal price. Just use coupon code rainy_day at checkout.


Keechma

Keechma is a frontend framework also based on Reagent. It looks pretty cool, though I’ve never used it.


Backend

Luminus

Luminus is a set of libraries pre-configured and put together to get you started quickly. It has an active community. The creator of Luminus, Dmitri Sotnikov, has written a book about Luminus, which is an extended tutorial.


Arachne

There was a lot of buzz about Arachne when it started as a Kickstarter project. But I think Timothy Baldridge’s explanation is the best. I’ll leave you to read it.


What a Clojure Web Framework Might Look Like

I’ve been thinking a lot about Clojure web frameworks. I recorded some of my thoughts about the matter.


Duct

Duct is a framework from Open Source Clojure celebrity James Reeves. Duct is an application framework with modules for web programming. It’s based on configuration data like Arachne, with a reloadable development workflow.


Coast on Clojure

Coast on Clojure is a complete framework, including routing, database connectivity, and Hiccup views.


Front and back

Fulcro

Fulcro is the best example of a front-and-backend framework. It tightly integrates the ClojureScript UI with the backend. There’s a lot of hope in the community about this one. I’ve never used it, but I’ve heard many good things.


Hoplon

Hoplon also has a frontend and backend story. This one is interesting because it doesn’t use React.


Why do we need a Theory of Functional Programming?

Though the theory has gotten a good reception in general, a few people have asked me why we need a theory at all. More people have told me I’m complicating functional programming, which they say should be a simple idea.

Transcript

Eric Normand: Why do we need a theory of functional programming? Why is it important to develop this whole thing that I’m doing right now? Hello, my name is Eric Normand. These are my thoughts on functional programming. As you may know from previous episodes, I am developing a book called “A Theory of Functional Programming.”

The idea is to set out the definitions of and relationships between the terms and ideas of functional programming, write them all down, and then carry them to their logical conclusions. People have asked me why I think it’s important. Some people are like, “Oh, yes, we definitely need that.”

Some people have questioned whether functional programming isn’t much simpler than I’m making it out to be, that I’m overcomplicating it. Some people say that functional programming is really simple: it’s programming with pure functions.

I’ve talked a little bit about why I think that’s inadequate already. Just to summarize it, what about everything else? What about data? It’s not sufficient as a definition. If you’re just programming pure functions, that’s lambda calculus. That’s not what we do. If you have data, you’ve already broken your definition.

We need to talk about data. If we’re going to talk about data, we might as well talk about side effects. Also, functional programmers do talk about side effects, even when they’re not programming. They have a definition of side effect that they use, and other paradigms don’t.

If you read a book on object-oriented programming, they don’t talk about side effects. That’s not part of their thinking. They’re talking about messages, classes, and has-a, is-a, all that stuff. Side effects are somewhere else on their radar. Maybe they get to it with something like command-query separation.

They’re not thinking about mutable state and global state, stuff like that. I mean, they’re thinking about it, but it’s secondary to their paradigm, whereas it’s primary to the functional programming paradigm. Someone should write that down.

The most common question I get is, “Isn’t functional programming just much simpler than you’re making it out to be?” The other question is, “Why do we even need a definition?”

The answer that I have is that if you look at object-oriented programming — or rather, the discourse about it — they have all these rich books that have defined terms, that have laid out the best practices, the worst practices, common ideas, patterns, idioms, all these things.

When they talk about it, they have this common lexicon. It’s a wonderfully rich language that they can use to discuss code, and to teach people how to code, when they’re teaching. We don’t have that in the functional programming community.

All of the books that I have found on functional programming are either about a specific language — it’s mostly about the language — or are highly academic. They are talking about category theory, lambda calculus, or something like that.

There’s not a book that’s meant to start the discussion about what it means to do functional programming, what are the important ideas, what are their definitions, how do we start taking those ideas, and building bigger ideas on top of them. Just lay out the field.

We don’t have that. I want to do that. I’m poised to do that. The theory that I’ve been talking about has the potential to do that. I’m crossing my fingers that enough people will care for that, that it will have some impact.

This stuff is out there already. What I’m saying is I’m not bullshitting. I’m not making this all up. I’m merely talking about what I’ve seen, what I’ve read people do, what I see in code. I’m trying to codify it into something coherent. I hope it is the start of something.

I would love for people to take the ideas in my book, run with them, and say, “If we apply that in this domain…” What’s an example that I can give? You have the book “Design Patterns.” It’s a seminal book in the field of object-oriented programming. Whether you agree with it or not, I don’t want to get into that discussion.

This book has given people a vocabulary to talk about common patterns, and shown people what is possible at a higher level when you’re doing object-oriented programming. You don’t have to use for-loops. You shouldn’t use an object or a class as a C struct. You should treat it like something higher level.

You have the visitor pattern. Instead of using a for-loop and going through a linked list, you should have some interface on your list so that you can do a map over it. This is the kind of thing that book talks about. I don’t know if people actually read the book, but they all know it. They know what design patterns means in general.

Other people can write a book called “Design Patterns for Legacy Systems,” and “Design Patterns for XYZ,” and “A Look at Design Patterns in…” They can write up these other books. They don’t have to establish the idea of design patterns.

In another book about object-oriented programming, you can reference the visitor pattern or the interpreter pattern, because someone has defined them. We need a set of books like that, or at least one book, to get that started for industrial functional programming, not academic.

Academic is great, but then they’ve already got their books. I want something for the industry that sets it on a good course.

All right. That’s why I’m developing the theory. That’s why I’m writing a book. These ideas need to be talked about. We can’t just rely on these super simple definitions that don’t actually cover what they claim to cover. They’re just pulling us into this academic world, where it’s actually a virtue to have a simple definition like, “We just use functions.”

Pure functions. “We program with only pure functions.” It sounds great in an academic context. It’s great in an academic context to make your definitions smaller, because then you can control them. You can know exactly what the consequences are.

In an industrial context, where things have to get done, you’re not looking for something that survives being picked apart by peer reviewers. You’re actually looking for something useful that can bring you to the next step, to the next level of understanding. That’s what I’m trying to do.

All right. I’m sure you disagree with me. You need to tell me about it before I write this book. Tweet me at @ericnormand, with a D. I’m also eric@lispcast.com over email. Yes, I’d love to hear from you. Let me know. I’ll see you then. Bye.


A hidden message in Cognicast podcasts

While listening to the Cognicast, a Clojure-related podcast, I found a funny thing. I’m not sure if anybody has ever mentioned it before on the Internet. Each episode starts with a jingle sound that follows right after the announcements and a long pause. And at the end, there is another weird sound that finishes the episode.

Those two sounds are borrowed from the Gauntlet 4 video game. It is a great fantasy game where one of four characters travels through mazes full of monsters to encounter a dragon at the end. I played this game on the Sega Mega Drive (Sega Genesis in the US) for a long time as a kid.

This is the first sound. It appears when a hero picks up a key:

And this is the second when a hero goes through a portal:

My guess is that there is a hidden message here: when entering a podcast, you pick up a key. And at the end, you exit the maze victorious.

The idea came to me when I found myself singing a musical theme from the game right after the podcast had finished. I’m not sure if I’m right, but it really is a great message. Thanks to the folks responsible for audio production.

Bonus: listen to the soundtrack. I can hardly believe it was written using a primitive 8-bit MIDI driver.


Kee-frame controller tricks

The power of controllers

The controller rules, taken from keechma, are centered around pure route data. That's really powerful, as you can use mostly plain Clojure to get what you want from your controllers. Often the solution is dead simple, but it's not always easy to spot. This is a guide to the most useful tricks; it will be expanded as new ones appear!
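To make that concrete, here's a minimal sketch of a controller (the event names and route shape are assumptions for illustration): :params receives the route data and decides when the controller runs, and :start is dispatched whenever :params returns a new non-nil value.

(require '[kee-frame.core :as k])

;; Hypothetical controller: :params picks the article id out of the route
;; data (nil when the route has none), and :start dispatches a fetch event
;; whenever that id changes.
(k/reg-controller :article
  {:params (fn [route] (get-in route [:path-params :article-id]))
   :start  (fn [ctx article-id] [:article/fetch article-id])})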

How to trigger an event once, at startup

So you want something to happen only once, but immediately. Things like:

  • Fetching initial data from the server
  • Starting a polling loop
  • Letting the user know we are loaded and ready to go
  • Forcing the user to express her never-ending love for cookies

What you need is a :params function that triggers start when invoked for the first time, and then always returns the same result as the first invocation. That sounds like a plain Clojure function that we all know:

{:params (constantly true) ;; true, or whatever non-nil value you prefer
 :start  [:call-me-once-then-never-again]}

How to restart a controller on every route change

Maybe you want to store a trail of breadcrumbs, maybe you want some logging done. Either way, this one is also quite simple.

For this case you need a :params function that returns a new unique result for every new unique route data. What tiny but familiar Clojure function could we use to achieve that?

{:params identity
 :start  [:log-user-activity]}

Getting weird

As you can see, most things with controllers are quite simple and use plain Clojure. Just for fun, let's have a look at a controller that triggers randomly:

{:params #(rand-nth [nil % % %])
 :start  [:will-receive-the-route-data-quite-often-but-not-always]}

That's it for now. Please report your useful tricks at kee-frame issues and I'll add them here!


Code Hibernation and Survivability – George Kierstein

I’m excited to welcome George Kierstein to speak at Clojure SYNC.

We’ve got a problem. It’s called bit rot. We have some working code. We don’t run it for a few months. And when we try to run it again, it doesn’t work. How did that happen? The system it was running on changed enough (packages upgraded, libraries deleted, passwords changed, permissions, hardware, IP addresses, and so on) that it just doesn’t work. The only way we know how to keep software working is to pay someone to run it all the time. It’s a bit like the Dark Ages, where you had to have monks copy books to keep the skill of reading and writing alive, and also to stave off the natural entropy (fires, rotting, losses, etc.) that would befall the books. There has to be a better way.

Here’s George’s abstract:

Can we store code and expect to ever get it to run again? Perhaps for a year or two, but what about 10? 20? How can we begin to reason about the problem domain? Taking a page from climate science, this talk explores models to better frame the problem, which could provide insight into how to design systems that can outlast us.

George Kierstein will talk about this problem and a solution she came up with to mitigate some of the challenges with archiving software for the indefinite future. We’re spending more and more time creating software. If it’s so important, maybe we should think about how to keep it.

Video

Please forgive the poor audio quality. We had technical trouble with the audio recording and had to use the backup audio. Please use the captions (CC) in the video player.

Slides

George Kierstein - Code Hibernation and Survivability

Download slides

Transcript

Transcript: George Kierstein talk
 
Eric: My next speaker, I invited her because I’ve had this experience where I’ve written some software and put it aside for six months. When I came back to it, it didn’t run. It didn’t work.

Digging into it, I realized my whole machine had changed, updated, things were different. It didn’t run anymore. That was only six months. We’re spending years of our lives building software. If stuff breaks after six months? We’re putting years of our lives into this. How can we let this happen? I don’t want to give the talk away. Please welcome to the stage George Kierstein.

George Kierstein: Thank you. Hello. My name is George Kierstein, and the work I’m presenting and the talk in general is based on my time at the National Climate Data Center, which is now the National Centers for Environmental Information, which is the nation and probably the world’s archive and main research center for climate-related topics.

The talk also gave me a great opportunity, like Will Byrd had, to nerd out about some problems I’m really fascinated by, one of which is something I’ve coined a generational-scale problem.

What’s a generational problem? A generational problem is a subset of these terrible and difficult problems in which the time scale and the time course of change is slow enough that we are cognitively blind to it.

We therefore have no intuition in which to ground how we’re going to approach these problems. They’re really tenacious and very difficult to solve. I like to call them dragons all the way down.

The other part that I find really fascinating is how the origin of civilization and what it can teach us about the evolution of organizations. Let’s get started.

Hibernation. It came out of a project that I was assigned to do, which seemed to the people who gave it to me kind of simple.

The story is, someone was about to retire. In fact, I have notes for this. Professor [inaudible] started this project in 1995. A couple of years ago he was about 80 and he was retiring.

He was the main contributor to one of the biggest and most important climate data models, called the Global Surface and Atmosphere Radiative Fluxes model — ISCCP. There are a lot of climate data models, but this one is what most of them are built on top of. Satellites measuring how dense the cloud cover is, among other things, all over.

A satellite database. He’s been refining this model since 1995 and he’s retiring. The next version was due out, but the satellites haven’t been launched yet. What do you do? He has this incredibly large code base that basically only he knows. Grad students, other contributors, sure, but it was his baby and he needed to retire and put it down.

This is a pretty common scenario in, I think, generational-scale problems. Some of the code bases we’re building for climate data research go back over 20 years, easy. They thought, “Let’s figure out a way (and it’s your job) to put this away so it’ll run really great.”

In some sense, I thought, “How hard could it be?” knowing that the short answer is, “Of course, it’s not really possible.” The goal is great, but can we really expect that at this time? No.

But something really interesting that I thought I could base some kind of solution on was the constraints. Constraint one: not the Will Byrd time scales. We’re not talking about 5,000 years. Maybe a couple of years, maybe a decade. If we’re really lucky, decades (or unlucky, as the case may be).

Some of the things that pose real problems here are that the original author is not available. When someone sets out to bring this back to life, there is no domain expert to help.

They are pretty much on their own. The software is ostensibly a black box, and for our organization there were also to be no VMs or containers. That was, at the time, a technological constraint.

But I kind of think, if you step back and look at the problem, that VMs and containers really are kicking the can down the road, and in some ways hiding complexity from the person reviving the code, which is worse, not better. That’s a debate we can have later.

Given all of that, I came up with, essentially, a protocol. Instead of any sort of software, it was a simple set of directives about how to package things up and what to include in this package that I thought would give the person reviving this the best chance they had.

It’s a human-scale solution. There was no attempt to try to think of “What good automation can I do that will solve this problem,” because in a very real way, again, it’s dragons all the way down. People have never really tried to do this before.

The chances that an organization that is underfunded and not as technically sophisticated as others would really understand any kind of automated solution (which would have to be maintained, too) seemed slim.

The best we can do with this type of problem, I think, is to come up with a human-based solution. The human-based solution basically leveraged a few principles, and I’m not actually going to tell you the details, because I think in reality they’re not that interesting.

The basis of my talk and what I’m going to try to argue here is not for any particular solution, really. It’s whether we can reason about this problem better than we have, which is to date really not at all.

Comprehensibility seemed like a fundamental principle and I kind of took the approach of my future self. Everybody, pop culture — think of your future self. This has happened to me in practice and probably to many of you.

Luckily for me I was at a company a really long time. Years later, I had to fix these bugs and look at very old code. I got really upset about the design decisions that were made and the styling until it sunk in that I wrote all this code. I forgot.

I had no idea. I looked at it fresh, and I was mad until I knew it was me. The future self model is not a bad one in this case.

Comprehensibility is really important. Can you even comprehend the basic structure of the layout of this thing that you’re trying to get to know and get to work again?

Simplicity in that sense is essential for comprehensibility. How else could you try to figure it out? If it’s super-complicated all the way down, this is going to take a lot of time. You might not be able to do it.

One thing that has been referenced in a lot of talks, and is kind of a general theme, I think, is that context is important. Because context really matters, we forced people to frontload as much information as possible.

We went to great lengths documenting the architectural decisions that had been made but never written down, in order to provide as much context as we could for whoever had to get this to run again.

That was the main thrust of how I approached this problem, which led me to feel pretty good about myself. I actually won an award for my protocol, which was kind of ridiculous, because essentially it was just “Please, have good hygiene. Do things sanely. Be sympathetic to whoever comes next.” Yet that was still a novel idea at that time and in that organization.

This actually brings us to the bulk of the talk, which is: well, that worked out well for them; we got his stuff put down. We did a test to make sure we could revive it at least once, with someone other than me, in production, and that went as well as could be expected. It took a month, by the way.

Which tells you something about the time scale. Even in a very controlled circumstance, when we literally had the guy who knew how to get it to run sit down with me and do the minimum we needed to revive this code and put it away, it still took somebody a month.

Organizationally, dev ops took weeks to respond to us. Dependencies were missing that we didn’t even notice at first, given the way they set up their deployment machine versus our development environments.

That’s pretty sobering already, and speaks to the level of complexity and how bad or how difficult this problem space really is.

Of course, my intuition when they said, “Hey, now that we’ve got this fancy thing, we’re under budget and we don’t know where to put all our data and we have like 70 new datasets coming in.

“Why don’t we — brilliant idea here, just brainstorming — why don’t we just replace all of actual data that we take so much time to have provenance for in the Dr. Sussman provenance way with the program and its inputs? It’ll save us so much time, so much space. It’ll be great.”

Of course, horrifyingly, “No” was the response. Trying to articulate “no” with something more reasoned than “you can’t make me, I’m not going to help you” and a hand-wavy no is what led to the rest of this talk. So, can we do better than just saying “no”? Let’s give it a shot.

So I’m gonna leverage some points of inspiration from sort of the generational problem perspective and the historical perspective of how civilizations have been formed and how they organize themselves to solve big problems.

So let’s do what everybody does and go back to the age of Newton. When we have an awful problem we don’t know anything about, let’s just define it: we’re going to say survivability is S.

Let’s try to constrain the problem space a little bit, because the survivability of any kind of system is broad. Software systems, in a very real sense, can be conceived of as ecosystems that live and breathe and thrive and die. There’s a lot more to it than we can lay out and tackle all at once. But there is some part of it that we might be able to come up with a better model to reason with. I call that resilience.

So what do I mean by resilience? In the sense that software is flexible and adaptable, resilience is the property of being resistant to change, of coping with change well. So resilience, in my mind and intuition, is inversely related to how fragile something is. That’s almost definitional from the English language. Let’s call resilience the inverse of fragility. So now we’ve got a model. We’re on our way. I’m reminded of Tim Gunn, who says, “Steal from the best and make it your own.”

Industrial engineering, for example, is a field that has really looked at this problem: how you go about systematically studying how things break in some kind of production system. Now, of course, emotionally I hate this, and I’m like, well, I’m not an industrial engineer; I’m a special person, just like all of you are above average, and I’m an above-average programmer like everybody else. So I didn’t like this very much, but let’s give it a shot, because they’ve done a really good job.

When you pick up an early text on industrial engineering, they talk about defect rates. They have nice models, and they say, “Okay, we can model a defect in a system like so.” Kind of like Dr. Sussman’s notion of skew, I defined it as drift. How fast and how far do the parts of the system (loosely speaking) that cause things to break drift?

Our problem space is well-defined: we put code away, we bring it back up, and we just expect it to work. We avoid a lot of other kinds of problems, like adding features or what happens when we send new inputs to it. Since that’s not our concern at this point, it’s really drifting dependencies that break. What really causes these systems to fall down? In practice, it was the dependencies. The algorithm, the nature of the thing itself: that didn’t change between the time we put it away and when we pulled it back up again.

So let’s just call it D. That’s drift, and it’s about dependency drift. Maybe, if we’re lucky, we could eventually figure out a statistical model. Now of course, that seems ambitious and dubious at best, right? Who knows if any kind of simple statistical model, like a Poisson distribution (which is the baby step of industrial engineering textbooks), has any application to what we’re talking about. In fact, maybe a stochastic approach might be a better fit, possibly even acknowledging that this is really a complex system and we have to start talking about the phase spaces of its stability and all the rest of that.

There are some interesting books on the relationship between stability, complex systems, and computability. So who knows, right? On some level, it’s like my Game of Thrones book club’s house motto: “I do not know.” So we might as well start with something simple that we can reason about, try to put our intuition into a form that will be more useful, and argue about it later. Let’s say resilience is proportional to the inverse of the drift of our dependencies, which for our problem more or less says: dependencies break, and the chance of the system being resilient is inversely proportional to how badly and how fast they break.

Let’s add some nuance to it: the magnitude. We don’t want just the number of potential things breaking; we also want to characterize how bad it’s going to get. The magnitude of impact is what we define for that.

And then here’s my Papers We Love moment. Can we put a bound on that magnitude? Well, there’s this great online paper called “The Mathematical Limits to Software Estimation,” which I love.
The author had a background in CMMI. So did I, interestingly. I’m not going to get into what that is, other than that it’s kind of an eighties-level, first-wave attempt to put some systematic reasoning and modeling around software estimation. We’ve mostly put it aside, except in places like the government, and gone with hand-wavy estimation by analogy. But this paper uses Kolmogorov complexity to analyze whether that’s even sane or possible, and the upshot I want to get at is that, ostensibly, no, it’s not.

If you’re doing an estimate, your estimate can be off by as much as the cost of rewriting the whole thing, which is a sobering thought. In a lot of ways it harkens back to Kim Crayton (I love referencing people, because everyone’s been so great presenting things that are super interesting), talking about the myths of programming and how we should change our culture to be science-based about it. One of the myths (I had a whole rant that I took out of this talk) is around the ten-times programmer and estimation: our cultural bias towards believing we can estimate, perpetuating some of these myths. Basically, the upshot is that we don’t believe these estimates, and that should be a really sobering criticism of the ten-times programmer. I pad things times two, times three; most people do, and as a manager, I took other people’s estimates and did the same thing.

So we’ve typically always, just by nature, padded out against this order of magnitude. That seems pretty sobering, even bad, but it does give us some kind of bound. In our problem, the worst case is that we have to completely replace the dependency. That, theoretically, is a task we can understand takes a certain amount of time. So let’s run with that. Let’s just say we have a linearly independent set of dependencies, which of course is dubious, but again: simplicity, and who knows.
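Assembling the pieces so far (my notation; the slides aren’t reproduced in this transcript): resilience R is inversely proportional to the summed drift of the n dependencies, each weighted by its magnitude of impact M_i, with each M_i bounded above by the cost of replacing that dependency outright.

R \;\propto\; \frac{1}{\sum_{i=1}^{n} D_i \, M_i}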

So now we have this nice series that will sum over all the dependencies and give us some kind of number. We’ve got a metric. We’ve got a model. And I’m going to segue now, because where’s that D going to come from? Is it even accurate? Who knows how you’re going to find that out? So I’m going to start talking from the history-of-organizations perspective, and jump to one way in which early science went about solving a generational-scale problem that in weird ways bears a lot of resemblance to this problem space.

Here is the Great Pyramid of Giza, built approximately in the 3rd millennium BC; nobody really knows. The interesting thing here is that it obviously took an enormous amount of time and effort. This is the vulture stone (I can’t pronounce the name of its site, and I’m not going to try). It’s an amazing piece of Neolithic art, but it also used to be controversial, because some people, the people everyone thought were crazy, were positive it was depicting comets slamming in around 10,000 BC and changing the weather, bringing about this period in history, causing climate problems and a big shift in human behavior, the origins of organization, right? Well, nobody believed it until very recently, when some researchers at the University of Edinburgh, building on Klaus Schmidt’s work, used a model to back-calculate where the stars would have been. The stone’s symbolic depiction is accurate literally down to 10,950 BC; they can back-calculate exactly when the comets came. The carvers were right, and it really is an amazing depiction of a historical event, with a lot of accuracy given the state of astronomy at the time. Apparently they also managed to record long-term changes in the Earth’s rotational axis from those comet impacts, putting them on these stones in some of the earliest writing.

So astronomy and our kind of monument building are really old, and our attempts to reason and write things down go back quite a long way. Basically, every monument-building culture was full of amazing astronomers, and they had sophisticated methods for building these massive monuments. Here is the city of workers from the Great Pyramid of Giza, which contradicts our previous assessment that Charlton Heston and slaves built the pyramids. In fact, they’ve been able to determine that it was about 10,000 very skilled workers who lived right next to the pyramids, and that in a very real way they looked at this kind of like an Amish barn raising. Their entire civilization was built around, and geared up for, building these great pyramids.

And so, these monument builders. These are the moai on Easter Island. They were the monument builders who basically built themselves into existence; that is their story, right? They geared their whole civilization to building these things, and in doing so they eradicated all the plants and the trees, and they went extinct. And this wasn’t that long ago.

In one way, I argue that this is a pattern for how humans organize solving problems. And we’re doing it again on a much grander scale. This is the Foxconn factory in Shenzhen.

So what are some of the takeaways here? In a sense that I think can be meaningful and useful for us, a civilization is defined by these monuments. We build monuments. We need to be careful, as Sussman pointed out, about what monuments we build and how we build them, because at the end of the day they come to define us in a lot of ways, at all levels of civilization.

Every single person in there has a kind of cultural attitude that they bring to work. And, you know, these monuments get built, and here we are today. Another thing that’s really fascinating: for basically everything I showed you previously, no one really understands how they pulled it off. I mean, let it sink in: the Great Pyramid of Giza is an immense engineering marvel, built with a precision we can’t easily replicate unless we’re highly motivated, right?

No one has any idea, because no written record of their building techniques is left, not even papyrus. No one knows how this was done, and in a lot of ways this inspires people, I think, to look to aliens. I think that is a huge mistake. I have a Facebook group chat called “Pure Idiots,” where a friend of mine really loves the ancient-aliens sort of thing. But I really argue with them; I can’t go there, because I think we should never underestimate human nature and how badly inspired we nerds are to solve problems. That, I think, can serve us both as inspiration and as a warning as we tackle the problems we have, right?

This is BG from earlier, or rather yesterday, and he said this great quote: “At scale, all problems are people problems.” Somebody and I were talking about it later: all of our problems are people problems, all the way down. I think we have a historical basis and frame of reference to understand that, and that gives us a lot of grounding and a historical reason to take cultural criticism seriously and to do the work to change our culture, when we understand in a first-principles way that it’s wrong, and to build something better.

So let’s get away from the long past and talk about something more modern, another type of generational problem: a global closed system, much like software. It pervasively impacts all aspects of human life and civilization (at this point, you know, discounting aliens), and its timescale causes us to reinvent the wheel with certain patterns. I’m talking about climate change, right? So I want to give you a quick overview, just a quick list of highlights from the history of weather and climate: what happened when.

1824: one of my personal heroes, Joseph Fourier, calculates basically the greenhouse effect, that the world would be far colder if it lacked an atmosphere. By 1849, the Smithsonian, with the invention of the telegraph, had given out instruments all over the nation to get people to collect weather reports and telegraph them in.

By 1870, we had a National Weather Service. Our government funded this whole effort because everybody really cared about it. Of course, the war happened and all the rest of it to motivate this a little more, and by 1934, right here in New Orleans, the first archive and tabulation unit for all this data was founded. By 1957 it had moved to Asheville, and it has been there ever since. That gives us a really interesting time scale for how people have gone about solving this intractable problem.

So what’s really useful for us? I think we need an active recognition of the pervasive impact this problem has on our lives, the way we work, and the way we prioritize within an organization. Data collection was a necessity and a passion because of that, again. Culture is built to build monuments; we build it to improve our lives, and mostly to build monuments. One thing I forgot to mention about the vulture stone is that recent research has shown that civilization and urbanization didn’t start because everybody thought it would be a good economy of scale. It happened the other way around. We started as monument builders, which started urbanization. We developed all these techniques, after the fact, in order to build better monuments.

So here we are, building a new monument. Data collection is important. That’s Thomas Jefferson, upper-left; he was an avid weather recorder. This is an example observation from 1907, a little more systematic, and that is the other takeaway point.

Systematic collection: they collected what they knew to be relevant. That’s true for the observations of Copernicus and Galileo, as well as for climate data. So I think of this as a pep talk: we have it easy.

We already have this huge corpus of data. It’s disorganized. It doesn’t really have provenance. But we do have a lot of data, and we’re starting to connect these dots all the time; a lot of talks at this conference touched on that. We have classifiable types of problems that are pervasive; people have seen them before, but we’re reinventing them again. And luckily, this time we have a much more immediate turnaround between our work and its impact on the problem space we’re trying to understand. So that’s great. I promise I’ll get back to my little model.

What can we do about this? Whatever effective solution we come up with will have to work on the personal dimension (how I act as a person, a contributor, a part of society) and on an organizational scale too. So we’ve got to answer “What do we do, and how do we motivate people to do it?” We need to start with some simple taxonomy, and it almost doesn’t matter what it is.

I’m arguing that something as simple as creating a true historical record would be useful, one we can then automate against, do machine learning against, or apply whatever technique we want, to find that letter D and describe it better. We can just systematically add a dependency annotation: “this check-in is because of a dependency.” That’s all we really have to do to start a real historical record. In the open-source world alone, that would turn what is now a disorganized corpus that can’t be searched into something where we can carve out how many of these problems happen, when, and because of which dependency issue, for this specific resilience problem.
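As a sketch of what such a record enables (the “Dependency:” trailer convention here is made up for illustration), a few lines of Clojure could pull every dependency-motivated commit out of a repository’s history:

(require '[clojure.java.shell :refer [sh]]
         '[clojure.string :as str])

;; List the hashes of commits whose messages carry a "Dependency:" trailer,
;; using git's own --grep (a regex match against commit-message lines).
(defn dependency-commit-hashes []
  (->> (sh "git" "log" "--grep=^Dependency:" "--format=%H")
       :out
       str/split-lines
       (remove str/blank?)))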

Okay, so how do we motivate organizations to go along with all of this, assuming that people in general care enough about their struggles with dependencies, recognize the utility these historical examples have, and agree that dependencies, as a class of problems, are worth paying attention to? That’s not all we know, but that’s okay; we can start there, and we should. So let’s test our model, because we need a way to communicate all of these ideas to other people. For that I’m going with the Hume…wait, I’m so bad at names, bear with me. Kuhn, Kuhn, “The Structure of Scientific Revolutions.”

I think we can evaluate this model from the point of view of whether it has explanatory power and is better than our current model. Our current model, honestly, is being teenagers who don’t want to clean up. That’s the attitude, and that’s how people treat problems like dependency management and other things like that.

Maybe we can re-characterize code debt. We can use this model in a real way, given our assessment that the worst case is replacing a dependency. That can be one application. Right now we don’t really characterize code debt: somebody says, “It’s bad, like, bad. We have to do something about it,” and then someone further up says, “Well, we can do that next quarter, maybe.”

If you could characterize how expensive it would be, then that would be a very compelling argument to look over time and say, “OK, are we actually driving ourselves out of business by ignoring this problem?” Like, “We’re going to have to pay this eventually, so how much is it actually going to cost?”

That’s typical of how organizational specialists like CTOs evaluate risk management: as a financial problem. We can give it a shot, because in some sense we’ve got enough of a historical record to actually calculate those costs.

If building that library dependency into your project took somebody M paid hours, well then, that’s an M that has a legitimate real-world value from that point of view.

Then of course, we have domain experts. Here you would ask anyone, “How long would that take you?” They’d say, “I don’t know,” but then they’d pad it by 10 and say, “Oh, a month.” Even if that’s not accurate, at least it helps the management and the organization to reason about this problem in a more principled way, I suppose, so I guess our model has utility, which is great.

I’m not saying, “This is a great model,” but we have to start somewhere. If nothing else, it should motivate people to think that we can take all of this and actually solve some corners of these problems in an old-school, early-science way.

What else can we do? I’d argue that we need to build an archive, but I mean that in that historical way that includes real provenance.

We need companies and organizations, and ultimately and idealistically our entire civilization, to value this enough to say, “We should pay for all of this. Hire people, train people, let people study it. More importantly, let’s contract it all and do it well, with a lot of systematic consideration.”

One thing would be really useful, because we need every citizen to participate: there are a lot of corporate citizens whose code is out of view; enormous amounts of our code are hidden from us. That’s the biggest criticism of the private-versus-public open-source ecosystem we’re in now, because that data is going to help us solve our problems.

We’ve got to have a way for them to contribute it to us that is not frightening to them. Motivate them to say, “We can strip all of our intellectual property out of this and maybe anonymize names of functions and stuff, so that we can feel safe about contributing at least this much information.”

That’s not that hard; it’s a problem we can solve and know how to solve. Other people, like Zach Tellman, who gave a great talk arguing for a simple taxonomy and the power of names, well, I think they’re right. Somebody should really start to do that.

It doesn’t have to be complicated, and we shouldn’t bicker about exactly what…Like, just calling it a “dependency” is really good enough to get started, until we know how to better refine our ideas about these things.

Finally, code archaeology is probably a sub-discipline that should exist. We have an enormous amount of code. When I first started looking at Poisson distributions, I thought, “It’s data,” and started looking at what I could fit, and none of it made any sense.

A lot of the check-ins are like that: some people don’t have public issue tracking, so you don’t even know which problems relate to a check-in. Some people have terrible documentation in their check-ins.

We don’t necessarily need that much refinement, although it would be very handy. If we could at least look at a code base and recognize, in an automated way, that there is some problem related to a dependency, so I could trawl all of GitHub and start to build timelines and ascribe life cycles, that would be great.

It would also provide us with a more historically based, evidence-based argument for how often problems happen and whether or not we should do something about them. That is the big takeaway. Thank you.

Eric: Thank you, George. Please, volunteers, bring questions. I had some questions about the monuments. You had a few monuments that you were showing. What are some of the lessons we can learn from, say, the building of the Egyptian pyramids? Is this something where we have to become slaves to our software, our data?

George: I think the takeaway is that our tools, and how we use them to build these monuments, are what we have to learn about, and we should reason carefully about how we’re doing it. It gives us a more compelling argument to liberate ourselves from pedantic tools.

Like, I hate the idea of daily stand-ups. I think there’s no real evidence that that’s actually an effective way to run a team. I personally go for organized chaos, and that seems to work for my group of people. I think there’s sufficient evidence, actually, to back that. We don’t have to become slaves to our tools. Instead we should re-engineer our culture from the ground up, at every level, to do a better job of building monuments, once we’ve done the work to analyze what we actually want to have at the end of the day.

Eric: Someone asks about VMs and containers. You said that you didn’t use them. Can you elaborate on why?

George: To start with it was a pure organizational constraint. I was just not allowed to use them, so it wasn’t even a consideration.

Even when talking and thinking later about “Let’s just replace all our data with code,” I think the issue is that even now the container wars are ongoing…

Like, “Should we have VMs?” “No, containers are better.” “Which container?” “I don’t know, I hate Docker,” or even going, “Who knows what containerization technology is going to look like?” or, “How long will that VM model last, and what happens when it goes away or breaks?” I think that’s just the type of problem where you kick the can down the road.

At least if, for the piece of code you’re putting down, the main points of breakage are directly related to the purpose of that code rather than to the systems around it, then whoever has to become a code archaeologist has a better chance of getting it to run again by reasoning from the system’s requirements.

Eric: I have another question. I’d love to have other questions, too, so please get them up here. I worked on a system, it was in C++ and I’d never done C++ before. I worked on it for two weeks to try to get it to compile on my machine.

I believe it was a bootstrapping problem, where someone had configured their machine over years and years and it just worked on theirs, their environment being in sync with the software. Then, once we just copied everything into a certain directory on my machine, it just didn’t work.

Of course, they hadn’t documented everything they did to the machine. Is there any way to recover that?

George: I think this speaks to the provenance problem. Maybe we need to stop using the word “documentation” and replace it with “provenance.”

In a way, if we can describe exactly where these changes came from and why people were motivated to make them, and do it systematically on any level, then that becomes a corpus of useful data. That harkens back to the turn of the century, those physicists…What’s the term for when a person works on more than one big problem?

Eric: Polymath?

George: Polymath, yeah, the more popular term, the polymath scientists you hear of. They addressed these problems, first of all, by making up definitions and then collecting the data they knew would fit, to the best of their knowledge at the time, what was relevant.

They had to do it systematically; otherwise, it was useless. We’re drowning in our data now, because it’s really hard to make sense of it. In that sense, the problem is pervasive: how do we build software this way when we’re not going to change our entire culture from the ground up to reassess it?

Eric: Whenever I look at those tables, it always seems like, “Wow, they really knew what data to collect.” Why is it then when I think of logging stuff, I always make it way too complicated instead of building one or two things that would be useful?

George: Sure. I call myself a “willful optimist.” If we’re going to compare ourselves to big efforts to solve huge classes of problems in the past, we’re right at the beginning of what we’re doing. While we’re really, rightfully, proud of ourselves for how far and how fast we’ve come and what amazing monuments we’ve built, the reality is we still barely know anything. We’ve seen only wisps of the benefits.

We’re not that far along in this process. I think it’s OK to cut ourselves a break and go with the very primitive, basic things we are starting to identify as problems. Dependencies are this weird class of problems that have all kinds of side effects that we would be better off not dealing with.

We don’t understand them at all. We understand them in isolation, our little corner of them but not in any systematic way.

In some sense, and I think this has something to do with the huge role corporations have played in bringing computing to where it is today (I’m not saying that with any critical overtones), because it’s been driven that way, companies were very focused on certain problems, and they don’t have to share.

There isn’t this civic-minded attitude towards the monuments we’re building, which older cultures had. Even in the sciences, people shared. Thomas Jefferson was recording data because that was a great thing to do: “Let’s contribute to science.”

Eric: This sounds a lot like Elana’s talk about Debian, this project that we all contribute to, where they do all the work and maybe churn the build even if…

George: Totally. I’m glad you brought that up. I wanted to reference her talk too. We have these entirely rich datasets already. That is one great example of how we’re really far better off than our predecessors might have been in the early days of scientific problems, because we have an amazing amount of data.

It’s not organized. We don’t know how to make good judgments or reason about what’s in there, because we’ve been ad hoc about the whole thing, mostly just reading what it says.

Eric: You mentioned ad hoc. Are there any ideas…? Is that a question? A doctor is required by law to keep certain kinds of medical records, meticulous records about every patient they see. Is there hope for something like that in computing?

George: I would be very careful about taking metaphors from how other disciplines have approached how to collect data and how to look at it and what records you really need to have, because their needs drove them to that place over time.

The organizations they built, which set those precedents, reflect that; it’s totally human nature. Some of them are bureaucratic needs and litigious needs. I think if we keep it simple and just confine ourselves to the simplest version of the thing that we understand, which at this point is just noticing that this is a dependency problem, then we will eventually come to the right balance of the types of things that we store.

Some of these AI questions that have been brought up, around how to codify ethics and rules and so forth, become problems we need to have some reasoning about, and then out of that will come actual record-keeping requirements.

Eric: I think I mentioned this in Will Byrd’s talk. It seems like our software only runs because we just never shut it down. We keep it running, to the point where we have a class of people whose job it is to keep the machines running.

George: That’s an idea. We’re worse off in a weird way than the medieval scribes, because at least they only had to literally copy something. We’re told often, “I want it just like that, but throw in this and paint it pink and draw three perpendicular lines.” Like, “Wait, what?” From my favorite YouTube video, “Three perpendicular lines.”

Eric: I can’t read the handwriting on this, but it’s about the NASA moon landing? Yes, please.

Audience Member: The NASA moon-landing tapes, the majority of them, or certainly the highest-quality ones, were lost because the format was not preserved, so even where the tapes physically survive, nobody can read them. Something analogous to a Docker container would have preserved the ability to read those tapes. Wouldn’t that be a big improvement over not having them?

George: Don’t get me wrong. I have nothing against container technology. I, in some sense, wish I could have used it. I really want self-healing code that can fix itself for me, something to put in place instead of these ad hoc, primitive methods that seem to be the only sane thing you can do facing all of this.

That would be great. I think we can do better than that. I think we can’t just assume that’s the solution to this problem, because history makes it pretty clear that that’s not how people approach these problems and that’s not how they end up solving them.

Eric: Isn’t a Docker container or a Docker image just another format you have to make sure that they’ll be able to be opened?

Audience Member: As with anything pedagogical, you build one on top of the other. You have an entire stack. We use that word every day. It works.

George: Sure. That’s a certain style of solution that has utility, and we have an intuitive, emotional reaction to it, like, “This is great, look at this monument. My stack’s 10 deep,” or whatever. I don’t think that necessarily does us a service when we’re trying to step back and look…

Audience Member: What’s the alternative? That’s my question. What’s the alternative?

George: I think the alternative is to continue forward with the things that work, but try to create a corpus of data that can directly address some of these questions, so that we can form a simple taxonomy, like a model of dependencies.

Other people, who are much better qualified and probably have better thoughts about this, could probably build a beautiful, simple taxonomy that looks at our classes of problems and can annotate them properly. If we all go ahead and do that, and are systematic about it, we have a chance to reason about why something works so well and, when it falls down, how and why it does.

There was an argument in another talk, earlier, that we need a systems language. We probably do, but that’s an intuition that this is a class of problems that could be better reasoned about with a better tool.

Container technology and stacks have taught us a lot about how to cordon off these dependencies and these levels of abstractions. I think that’s incredibly useful. We couldn’t be doing what we’re doing today without some level of it.

Again, I think we need to be a little humble and face that they don’t solve all our problems, that we don’t understand enough, and that we could come up with a better solution, but we need better data.

We need to take that other intuition, that they don’t always work, more seriously, in a systematic, cultural way, and actually get people who are motivated to deconstruct why and how. I think when we do that, and we don’t look at each problem like it’s in isolation, we can start comparing them to similar classes of problems.

Do containers have the same problems as VMs? That’s an interesting question I don’t think anybody’s…I’m not an expert in this, but I think that’s an interesting question. How could we have thought about that? We just came up with this stuff.

We don’t have all these answers, and that’s OK, because I think we’re starting to figure it out, but we need to compare notes and we need to do that systematically. That’s how we’ll come up with solutions.

Eric: Any other questions? Yes?

Audience Member: In terms of day-to-day first steps for conscientiously writing your programs and getting them archived and annotated, what’s your advice?

How would that look, assuming that what you are writing will break in six months, and reasoning about predicting why?

George: Predicting why it breaks is the holy grail of what we’re doing with this kind of data collection.

I think the historical record, and being conscientious about the provenance of the types of problems you personally are encountering with the code base, goes a long way toward letting you build tools that help you reason about “What happened, and why didn’t it happen last year?” or, “Is it really this type of library that’s to blame?”

“Maybe it’s the way we built our API.” But to get there, you have to do simple things. Like, if you really have constant dependency problems, you should do the simple thing and put “dependency” on every check-in that’s related to it.

Whatever it is, you should be systematic about it and treat it with a certain level of enthusiasm, trusting that eventually this will actually help you search through your code base and your history, and let you start to reason better about the problem space you encountered.

Eric: There’s the idea of the changelog as a standardized point, with practices about adding entries every time you do a release. Are you thinking of something like that?

George Kierstein: No, I’m being very literal: systematically add the word “dependency” to all the relevant check-ins. When we write a check-in comment, if it’s related at all conceptually to that part of your problem space, it should have “dependency” in there, in that comment.

That’s the simplest thing you can do. Since each check-in has a timestamp, we then have a lot more information about what changed when, and we can figure out how without trying to look at every check-in or relying on remembering what we did, for example that it was due to some check-in like a platform dependency.

I think other classes of problems, given a simple taxonomy (you could start your own), can help you carve up these problem domains. From there, you can start to reason about whether the part you suspected was going wrong, and suspected was influencing this, actually is what’s happening.

Because right now we just have our intuitions, and we roll with those, based on experience. But the truth is, those are often wrong.

Eric: I have another card, from Hunter. “How do literate programming approaches come in?”

George: I love the idea of literate programming, but it’s been a really long time since I looked at it. I don’t know. I haven’t really thought about that, but yeah, it speaks to the fact that we need to rethink what we mean by documentation. I think Knuth was right about that.

Again, here we are, years later, and I haven’t read his books anyway; I didn’t have the right background to tackle it without that. It sounds great, let’s do that. But we do need to rethink documentation, and we need to rethink how we look at what we want to record and how we want to do it.

Sure, maybe we can rethink how we program altogether. Because again, the tools that we use to build the monuments are symbiotic with them, and evolve in line with them.

Male Audience Member 1: But in that sense, literate programming, taken as a practice — I’ve never been super-successful at it — is not just documentation. It’s how we approach the actual construction of the software and the organization of the software.

George: Yeah, that’s great. Again, I don’t really know anything about it, but at the end of the day, that’s more evidence that code archaeology should be treated like a legitimate discipline, and maybe people should start doing that.

We can get value from it, and use it to justify the changes that people advocate and the solutions to certain parts of the problem domain that people are really excited about.

Eric: Yes.

Male Audience Member 2: All that historical stuff…People were able to date events with the stars. They recorded these events, and so now we can guesstimate better when things happened. Is there any kind of log we could have, in computing, that would let us say, “OK, I can guess that the dependency’s happening between this and this”? Or, “These things should work together”?

George: That’s a great question. I don’t know. I think that’s what we’re looking for when we, or historians, try to analyze our own computing history, and people try to put together talks about how to systematically think about systems versus processes versus these other things.

Yeah, it would be wonderful to take something like all the dependency-management stuff that Elana is working on and tie it into an actual narrative that has descriptive power, to say, “Oh yeah, this class of libraries sucks. We should really stop trying to use them.”

Eric: OK. I think we’re going to end it there. Thank you so much.

The post Code Hibernation and Survivability – George Kierstein appeared first on Clojure SYNC.


Tracking changes to a Reagent atom

I was recently having some difficulty debugging a problem in a ClojureScript single page application. The SPA was implemented using reagent [1].

This interface stores most of its state in a global reagent.core/atom called db. To debug the problem, I thought it would be useful to track how the global state changed as I interacted with the interface. How do we do that?

For the rest of this article, pretend that (require '[reagent.core :as reagent]) has been executed.

First, let’s define db-history in the same namespace as the global reagent/atom, db. This is where we’ll collect the changes to db.

(ns ui.data
  (:require [reagent.core :as reagent]))

(defonce db (reagent/atom {:app/current-page :offer-list}))

(defonce db-history (atom []))

Next, let’s write a function called aggregate-state. This function grabs the current value in db and conjs it onto db-history. It also limits the history to the most recent 101 states.

(defn aggregate-state []
  (let [d @db]
    (swap! db-history (fn [hist]
                        ;; keep only the 100 most recent states before adding
                        ;; the new one (take-last, not take, so that the
                        ;; oldest entries are the ones dropped)
                        (-> (take-last 100 hist)
                            vec
                            (conj d))))))

Now we need to invoke aggregate-state whenever db changes. We can do this using reagent/track. reagent/track takes a function and optional arguments and invokes that function whenever a reagent/atom that function depends on changes.

reagent/track! is similar except it immediately invokes the function instead of waiting for the first change. We can use it to cause aggregate-state to get called whenever db changes.

(defonce db-history-logger (reagent/track! aggregate-state))

Now history of the global state is being tracked. But we need a way to access it. Below is what I ended up writing. When you call ui.data.history() in Chrome’s JavaScript console, it returns an object you can click on to explore. If you pass in strings as arguments to history then it only selects some of the data from the global db and history.

(defn ^:export history [& args]
  (let [d @db
        k (if (seq args)
            (map keyword args)
            (keys d))]
    (clj->js {:history (mapv (fn [x] (select-keys x k)) @db-history)
              :current (select-keys d k)})))
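
As a usage sketch, assuming a REPL connected to the running app (the "app/current-page" key comes from the db definition above; any other key names are whatever your app stores):

;; Equivalent to calling ui.data.history() or
;; ui.data.history("app/current-page") in Chrome's JavaScript console.
(ui.data/history)                     ; full history plus current state
(ui.data/history "app/current-page")  ; restrict to the :app/current-page key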

It only took about fifteen lines of code to gain a view of our application’s state changes over time. This view helped me solve my problem. Hopefully it will help you too.
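
As an aside, if you only need this on a plain Clojure atom rather than a reagent/atom, the same history can be captured with add-watch instead of reagent/track!; a minimal sketch, reusing the 101-state cap from above:

;; add-watch fires after every swap!/reset! on db, receiving the old and
;; new values, so each new state can be recorded without polling.
(add-watch db :history-logger
           (fn [_key _ref _old new-state]
             (swap! db-history
                    (fn [hist] (-> (take-last 100 hist) vec (conj new-state))))))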


  1. This particular project is nearly four years old and has had many hands on it over the years. Working in it reminds me of how useful re-frame is on larger applications like this one.


Two Years of Lambda Island, A Healthy Pace and Things to Come

It’s been just over two years since Lambda Island first launched, and just like last year I’d like to give you all an update about what’s been happening, where we are, and where things are going.

To recap: the first year was rough. I’d been self-employed for nearly a decade, but I’d always done stable contracting work, which provided a steady stream of income, and made it easy for me to unplug at the end of the day.

Lambda Island was, as the Dutch expression goes, “a different pair of sleeves”. I really underestimated what switching to a one-man product business in a niche market would mean, and within months I was struggling with symptoms of burnout, so most of year one was characterised by trying to keep things going and stay afloat financially, while looking after myself and trying to get back to a good place, physically and mentally.

Luckily that all worked out, and during this second year I’ve managed to find a steady, sustainable pace. I’ve been recharging, I even managed to take some holidays this time, and I’m slowly rebuilding my previously depleted savings.

I’m still not able to live off Lambda Island completely, but it provides a good bit of income. One lesson I’ve learned first hand is that the greatest product is worth nothing without good marketing, and of all the hats I wear the one that says “Marketer” is perhaps the one that suits me least.

Still, I have gained some recognition for my efforts in Clojure’s community and in Open Source development, as well as for creating content of exceptional quality, and so there continues to be a steady trickle of new signups. This year has also seen more teams signing up for company plans, which I think is a great development.

It makes a lot of sense, Lambda Island has always focused on the kind of things you’d need to know and use in an actual job, rather than on what’s the latest hotness, and so teams are finding it a great way to quickly introduce people to Datomic, teach people foundational Clojure concepts, or improve their approach to testing.

Coaching and Training

Hiring and onboarding Clojure developers isn’t always easy. There is only a limited number of senior Clojurists in any given locality, and so companies have to train and mentor more junior profiles, as well as experienced devs coming from other languages. Lambda Island can be an excellent resource for this.

I’ve been helping one company with this process directly. Nextjournal is building an ambitious product using Clojure and ClojureScript, but most of the devs are coming from Erlang and Elixir.

Over the past year I’ve helped them figure out issues with their tooling to make sure everyone can work comfortably. Through one-on-one coaching sessions and code reviews I’ve helped people to grasp Clojure’s guiding principles, adopt a REPL-based workflow, get better at writing idiomatic code, and generally get over the uncertainty of “are we doing this right?”.

From talking to people at conferences and meetups it seems there are more companies that could benefit from this kind of personal coaching, and so I’ve been talking with some talented Clojure people to see if we could start offering this as a general service. I’m very excited about this possibility! If you think your company could benefit from any kind of training or coaching by experienced Clojurists, or if you want to be the first to know when this service becomes available, then please drop me a line (arne at lambdaisland dot com). We’re still figuring out the specifics, and any concrete input at this point would be extremely valuable.

Lambda Island

Lambda Island will continue to regularly publish new content; the next episode will be a follow-up to Episode 38, Transducers. While that one taught you how to use the built-in transducers, the new episode will dig deeper into how transducers work, and look at some powerful transducer libraries like xforms and kixi.stats.

Year two has seen fewer new episodes than year one, only about a dozen versus the 30 episodes I cranked out in those first twelve months. This has turned out to be a more sustainable pace, and while I would like to increase the frequency a bit again, it will remain closer to once a month than to once a week.

You may have noticed that the title screens have started looking a lot nicer though, that’s because this work is now done by the talented Lubov Soltan, who also created the branding for the Dutch Clojure Days. This has been a big step for me as it’s the first time another person has been involved in producing episodes.

Buying episodes

Something else I’ve been working on is making it possible to buy individual episodes. So far the only way to access premium content has been through a subscription. This made sense at the beginning, as there wasn’t much content yet, and new videos were coming out regularly. Subscriptions were offered cheap, and what you were really buying was the promise of future updates.

Now that there’s a substantial catalog, new subscribers get instant access to 40+ episodes, about 9 hours of content. Subscription prices have gone up a bit, but not nearly as much as they should have, considering the value you’re getting. On the other hand, consumers have become more price-conscious about recurring charges for online services.

So the plan is to sell individual episodes instead; they’ll likely be priced around the ten-dollar mark. There will still be subscriptions, but they will be marketed more as premium “all access passes” aimed at companies.

This will also make it more rewarding for me to create new content, and to do better marketing, as I can more directly correlate my efforts to sales. I’ll also be able to see more clearly which topics work well and which don’t. In the end there’s no better way to figure out what people want than by letting them vote with their wallet.

Of course all existing customers will be “grandfathered in”, you can keep your existing plan as long as you like, even those that signed up for ridiculously cheap yearly plans all the way in the beginning. You’ve supported me from the start, this is the least I can do to say “thank you”!

Privacy

Finally, I’ve been shipping various improvements and fixes for long-standing bugs to the site. Part of this has been making the site GDPR compliant. Email notifications are now strictly opt-in, and there’s a Privacy Policy that complements the existing Cookie Policy.

I’ve also removed all third party JS and other assets, with the exception of the Vimeo player, which unfortunately also injects some analytics tracking of its own. We might add back some server-side analytics in the future if it makes sense for marketing purposes, but as it stood the analytics were rarely looked at, so no need to let BigCorp track you because of it.

Community and Open Source

It has been part of Lambda Island’s mission to support the Clojure community and ecosystem, and plenty has happened on that front this year.

The big one has been that we relaunched ClojureVerse. This forum had been up and running for several years, but few people actually knew about it. We figured there was a need for an alternative online space, one that’s less formal than the Clojure mailing list, and less noisy than the Clojurians slack. A warm and welcoming place for thoughtful discourse, and for sharing what you’re learning and working on.

On the 2018 Clojure Community Survey over 17% of respondents mentioned using ClojureVerse, which considering it was only a few months since the relaunch is a really nice result.

We’ve also replaced the Clojurians slack log with a proper Clojure app (github and announcement). The old kludge of Python and Node scripts had become a nightmare to maintain. By moving it to a Clojure web app it has become a lot easier for people to submit contributions. Message parsing and rendering has much improved, and we’re properly showing threaded messages, a Slack feature that didn’t exist when the old site launched.

There is still work to be done. So far we’ve focused on getting the thing running smoothly and rendering things properly, and on making sure the site stays accessible when half a dozen indexing bots are crawling it at once. It’s all taken a bit of time, but this is a long-term effort, and we’re getting there.

The main thing that’s still missing is to automate the import of new logs into the database. This is currently still a manual process, which means that the site is often quite a bit behind. This too will get sorted out in time though.

Lambda Island picks up the tab for the hosting and the domains for both ClojureVerse and the Clojurians log, and does the bulk of the system administration and development for these projects.

Open Source

In terms of open source it’s also been a good year for Lambda Island. We’ve released new versions of lambdaisland/uri, Chestnut, lambdaisland/ansi; contributed to the Emacs world with Chemacs and parseclj; submitted patches to clojure.tools.cli, integrant, rewrite-clj, matcher-combinators, cider-nrepl, lein-figwheel, sparkledriver, and toucan. These are often small patches, but integrated over time they have likely made many people’s lives just a little easier.

I’m currently working on a project that could have a real impact on how people structure and run their test code, but since that’s not quite ready for prime-time yet I’ll save the details for a next installment.

Conclusion

Looking in from the outside it may seem like not much has been happening here this past year, but nothing could be further from the truth. Several things have been brewing behind the scenes, and you’ll get to taste the fruit of that labor before long.

The biggest development is that Lambda Island is no longer a one-man venture, I’ve started collaborating more with others, both formally and informally, and you’ll be seeing much more of that in times to come, which is why I felt I could already dish out a royal “we” a few times in this post.

If you want to stay up to date about future developments then please sign up, and opt-in to receive our newsletter.


RIP grandma; meo progress & beta version

A couple of weeks ago I introduced meo, the intelligent journal that my beloved grandma inspired around two years ago. Since that blog post, she passed away, following a stroke and subsequent coma. Those weeks have been tough, and I miss my grandma a lot. It helped me quite a bit though to be working on meo, something that she inspired and that will be part of her legacy. There were multiple occasions recently when I might otherwise have given up working on meo, thrown in the towel, and looked for another hobby. Yes, I still like Clojure, but this code base that I created has made me feel like an idiot way too often recently.

But instead of complaining, let me just give you an update on where I made progress, and what I was struggling with. While grandma was still in the hospital, I played around with the Mapbox API and a map view that allows zooming into areas with recorded photos, and then seeing which photos were taken there. This is how that looks, for the photos I took at EuroClojure 2016, and their respective whereabouts:

That’s working pretty well, but I can’t include it in a free version of meo, as then I would be paying for your usage of the Mapbox API, and I will not do that. Not sure yet what to do with this feature. Maybe something similar can be created with OpenStreetMap or Google Maps? Or you could create your own Mapbox token. I’m open to ideas, and pull requests are welcome.

Then I took issue with me calling meo an intelligent journal and realizing that it’s not particularly intelligent thus far, so it clearly needed the integration of a neural network, right? So I learned some Tensorflow and created a simple feedforward network for predicting which story an entry might belong to. I wanted to power the story select field by a top ten of the results from asking the network for a prediction. That works well enough, with an accuracy of over 90% for the matching story being in the top ten predicted stories. That task was fun, but at the same time, it was bullshit at this point. Rather, it was that kind of a shiny object that an intelligent journal should really protect me from pursuing by making me stick to my plans - and hold myself responsible for the things I have actually committed to. I’ll get back to neural networks within meo at some point, but I’m also hoping to find collaborators who are interested in some machine learning inside a data-driven journal and want to help make this useful.

Then I had a really annoying issue with the ClojureScript client inside the Electron renderer process somehow disconnecting from the Clojure backend, or rather all processing getting stuck, and it would only work again after completely closing the Electron application and reopening it, as a simple refresh in Electron’s developer tools would not help. I think I eventually found the problem by using the YourKit profiler, which showed me a deadlock related to logging.

I am not entirely sure what happens, but I know that when I used timbre with the default configuration merged with mine, and multiple threads tried to log at the same time, they seemed to compete over stdout, blocking the entire application. Not sure if that is to blame on timbre, or on log4j also being in there from other libs, but for me it was definitely a big WTF moment. Now everything is working again, but without logging to the terminal, which is weird. Please let me know if this sounds at all familiar to you, and what you did about it. I created an issue for it.
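
For reference, the workaround is roughly of this shape; a sketch assuming timbre’s default :println appender is the one contending for stdout:

(require '[taoensso.timbre :as timbre])

;; Disable timbre's default :println appender so multiple threads no longer
;; contend over stdout; the trade-off is losing terminal logging entirely.
(timbre/merge-config! {:appenders {:println {:enabled? false}}})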

Then I was trying some simple refactoring and noticed that it had become very unwieldy to work with the project, as persistence- and retrieval-related code was sprinkled all over the codebase, and totally ad hoc. I had wanted something like GraphQL for a long time, but when I looked at Clojure implementations early last year, I found Lacinia and did not understand how to use it. I looked at it again earlier this month and decided to finally make the switch. Overall it’s really nice to have a language for describing how returned data looks, and then fetch exactly what you need, no more and no less. Before, it was really tripping me up what to fetch when, and then mostly underfetching a little, and sometimes overfetching so much that it would slow down the entire application.
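
To illustrate the idea, here is a generic Lacinia sketch, not meo’s actual schema; the Entry fields are hypothetical:

(require '[com.walmartlabs.lacinia :as lacinia]
         '[com.walmartlabs.lacinia.schema :as schema])

;; A minimal compiled schema: one query returning an Entry object.
(def compiled-schema
  (schema/compile
    {:objects {:Entry {:fields {:timestamp {:type :String}
                                :md        {:type :String}}}}
     :queries {:entry {:type :Entry
                       :args {:ts {:type :String}}
                       :resolve (fn [_ctx args _parent]
                                  {:timestamp (:ts args)
                                   :md "entry text"})}}}))

;; The query names exactly the fields it wants, no more and no less.
(lacinia/execute compiled-schema "{ entry(ts: \"123\") { md } }" nil nil)
;; => {:data {:entry {:md "entry text"}}}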

So I refactored the code base to use Lacinia for all data retrieval in meo. Mutations may come later at some point, or maybe not. Using Lacinia started very smoothly as long as I was interacting with my development instance. But then with my actual dataset of 91K entries and 820K words, it started getting pretty slow, and I initially found it difficult to figure out why.

One of the unexpected things was that keys have to be in snake case. So I did a transform-keys from the camel-snake-kebab library, and that turned out to be a pretty dumb idea, as for some cases it took over 600ms to transform the initial data structure to give it to Lacinia as required. But there was no other way, except for migrating my entire append log (roughly 170K lines, where each line is a new version of an entry). That’s what I ended up doing since it’s way better to do this once as opposed to on every request. The migration worked fine, but it’s also weird to not have the same case everywhere. So if you wonder why data is using snake case, it’s because the GraphQL spec requires it, and by proxy Lacinia as well.
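
The transform in question looks roughly like this (a sketch; the sample data is made up):

(require '[camel-snake-kebab.core :refer [->snake_case_keyword]]
         '[camel-snake-kebab.extras :refer [transform-keys]])

;; Recursively rewrites every map key, e.g. :entry-count -> :entry_count.
;; Walking every node of a large nested structure is what made this cost
;; hundreds of milliseconds per request.
(transform-keys ->snake_case_keyword
                {:entry-count 91000
                 :task {:completion-ts "2018-06-01"}})
;; => {:entry_count 91000, :task {:completion_ts "2018-06-01"}}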

Then I wanted GraphQL queries to execute in parallel, and did not understand how to do that for a surprisingly long time, until I finally figured out a way that works. I initially thought I could call execute with a few queries in parallel, say inside a few futures, and then have those run independently. Unexpectedly, though, they did not run independently, but rather sequentially, which I found odd, because what does that thing actually synchronize on at all? I learned that I need to implement async resolvers, plus assign a thread pool. Now that that is figured out, it’s running smoothly. I just feel like the code around data retrieval is still way too complicated, and I’m looking for collaborators who want to help me clean it up.
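
An async resolver along those lines might look as follows; a sketch, where fetch-entries is a hypothetical stand-in for whatever actually loads the data:

(require '[com.walmartlabs.lacinia.resolve :as resolve])
(import '(java.util.concurrent Executors ExecutorService))

(defonce ^ExecutorService resolver-pool (Executors/newFixedThreadPool 4))

(defn fetch-entries [args] [])  ; hypothetical: the real data lookup goes here

;; Returns immediately with a promise; Lacinia carries on with other
;; resolvers and picks up the value once deliver! runs on the pool.
(defn entries-resolver [ctx args parent]
  (let [result (resolve/resolve-promise)]
    (.submit resolver-pool
             ^Runnable (fn [] (resolve/deliver! result (fetch-entries args))))
    result))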

Oh and then I tried reviving the packaging for Windows 10. If you suspect a rant now, you’re wrong, to my own surprise. After installing cygwin, it has been running very smoothly, and I have published tens of versions of meo into an S3 bucket without a glitch. On Linux, however, setting up a virtual machine for publishing AppImage files was way more of a nightmare, with Electron relying on global libraries, and new versions of it that weren’t available in Ubuntu, and so on. Eventually, it all worked out though, and here are the installers:

All of these provide auto-update functionality, which can be accessed through “Check for Updates” in the application menu. In addition, checks for a newer version run once every 24 hours.

About using meo to document the process: it’s quite charming to have contemporary witness reports for all those things that were bugging me. That’s because I document the entire process of whatever I am working on, including screenshots. That makes it super nice to look up all this stuff, rather than having to rely on memory. And then, yeah, it keeps me accountable. The stuff that I am working on is my life, for the number of hours that I spend on it, and it pollutes other areas of my life when I cannot leave grief where it belongs. Doing a brain dump into a journal entry isn’t so bad for that, and then telling yourself that you can stop worrying about stuff, as it can all be picked up from the writing next time.

But even better is to have a process in place to look at the amount of frustration in your life, and to try everything you can to actively change the situation from the inside. And if that does not work, to know when to quit. It has happened way too many times in my life that a fucked-up situation just normalized and became my reality, instead of prompting the necessary change. An intelligent journal should really help and support you in a situation like that. Meo isn’t doing that yet, at least not to the extent that it could, but that is where I want to go with it.

Please try out the appropriate links if you like, and let me know what you think. The entirety of the functionality will certainly not be obvious, and I haven’t gotten around to writing a manual yet. But let me know where questions arise, and ideally create issues on GitHub for those. I will try to answer everything that comes up. Thanks & until next time.

