Archives for April 2017

YeSQL? How a Spanner changes the toolbox

April 24, 2017

The world of data continues to evolve. The last explosion was the rush onto the NoSQL bandwagon. In the rush to solve the problem of not being able to scale that Oracle DB, many ended up giving up a lot along the way. Maybe the problem wasn’t SQL after all; it was more about dealing with the growth in data. It was one thing to have client-server apps that needed data, but once we got to the infamous “web scale” we started to run into problems.

The database crew tried hard to keep us tied to the database. Stored procedures resulted in the SQL DSL becoming more and more complex and divergent (I was on the Sybase side, which was much superior to Oracle!! 🙂).

Then desperation kicked in and the JVM was put into the database itself to be close to the data.

MySQL and memcached got us a long way, and then there was a split. A lot of the scaling issues revolved around the desire to normalize your data, and the resulting JOIN hell couldn’t perform. One solution was to denormalize, treat your MySQL bigtable-style, and give up on some of the relational concepts in the DB, handling them in the app layer instead (the opposite direction to stored procs).

Others jumped in on “NoSQL”, leaving behind the SQL and relational world. The path to scaling was eventual consistency and different data models. An explosion of new approaches came into being, which has been exciting even if it has come with some pain (choice, immaturity).

We can unbundle the various trade-offs. With Spanner, for example, you can benefit from the ecosystem and tools around SQL while still handling scale and cheating your way around the CAP theorem. Being able to keep SQL and still get the consistency benefits changes the fundamentals involved in making a choice.

Now, SQL can certainly be frustrating. We have lived through the ORM years, and how much code have you written that marshals between strings and the relational model as we map to an OO model? If you believe in type systems, it can feel like clawing your eyes out when you see code that converts across those boundaries. Whenever you bridge between worlds like this (e.g. JavaScript to native land in React Native) you are holding your nose and valuing the benefits that you get on the other side.
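To make that boundary concrete, here is a minimal sketch of the kind of hand-written marshalling code in question. The row shape, the User model, and the column names are all hypothetical, invented for illustration; no particular driver or ORM is implied.

```typescript
// Hypothetical row shape: assume the driver hands back every column as a string (or null).
type RawRow = { [column: string]: string | null };

// The OO-side model the application actually wants to work with.
interface User {
  id: number;
  name: string;
  createdAt: Date;
}

// Boundary code: every column is parsed and validated by hand.
function toUser(row: RawRow): User {
  const rawId = row["id"];
  if (rawId == null || !/^\d+$/.test(rawId)) {
    throw new Error("bad or missing id column");
  }

  const name = row["name"];
  if (name == null) {
    throw new Error("missing name column");
  }

  const rawCreated = row["created_at"];
  if (rawCreated == null) {
    throw new Error("missing created_at column");
  }
  const createdAt = new Date(rawCreated);
  if (isNaN(createdAt.getTime())) {
    throw new Error("unparseable created_at column");
  }

  return { id: Number(rawId), name, createdAt };
}
```

Three fields already take this much ceremony, which is the frustration: the type information exists on both sides, but the bridge between them is strings.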

SQLite did such a good job as an embedded database that we ended up getting it on all platforms, including mobile. This is a little confusing at times. Many of the benefits of SQL are on the backend (OLAP, OLTP, etc.) and having to go through the mapping often feels like real overkill. Wouldn’t it be nicer to take your application state, and your user’s data, and just… save it? Thus we got solutions that allowed you to do just that, and then tackle other problems such as the notion of multiple clients and real-time (e.g. the original Firebase real time database!). For many applications a tool such as Firebase is a great solution, as it is easy to reason about with multiple clients, and you get a rich way to query for more complex use cases.

But what about other use cases? I was working on a knowledge base system that allowed you to store your own sets of knowledge, and also share this data with other users. This predated noms (from the amazing Aaron Boodman and team), which gives you a “git for data” feel. Our solution was more of a shared graph, so it naturally fit with a graph database. At first we used Neo4j, and then GraphQL came along, which gives you an interface on top of various stores.

We needed the following:

The user will have data that needs to be sync’d across all of their devices and surfaces
The data can be connected to multiple users
Users could edit shared data together at times
Users could also fork data (and merge) to allow for differing content without siloing forever
Have as much data as possible on device so it works offline.

As soon as the system gets big you run into the obvious constraints. Most of the time “your own” data could always be available on device, but the shared knowledge graph was often too large. We had to come up with good strategies to sync a subset of the overall graph. When running searches we needed to run multiple queries: one locally to get you results ASAP, and another to the backend to gather others that were outside our local scope.
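The two-query search can be sketched roughly as follows. This is not the code from that system; the names (SearchResult, searchEverywhere, mergeResults) and the choice to prefer the local copy when both sides return the same id are assumptions made for illustration.

```typescript
// Hypothetical result shape; the real system stored graph nodes.
interface SearchResult {
  id: string;
  title: string;
}

type SearchFn = (query: string) => Promise<SearchResult[]>;

// Local results come first; backend results are appended unless the same id
// was already found on device (assumption: the local copy wins).
function mergeResults(local: SearchResult[], remote: SearchResult[]): SearchResult[] {
  const seen = new Set(local.map((r) => r.id));
  return local.concat(remote.filter((r) => !seen.has(r.id)));
}

// Start both queries at once, report local results as soon as they arrive,
// then report the merged set once the backend answers.
async function searchEverywhere(
  query: string,
  searchLocal: SearchFn,
  searchRemote: SearchFn,
  onResults: (results: SearchResult[]) => void
): Promise<SearchResult[]> {
  // If the backend is unreachable (offline), fall back to an empty remote set.
  const remotePending = searchRemote(query).catch((): SearchResult[] => []);

  const localResults = await searchLocal(query);
  onResults(localResults);

  const merged = mergeResults(localResults, await remotePending);
  onResults(merged);
  return merged;
}
```

The caller gets two updates: a fast one from the device and a fuller one later. When the backend call fails, the second update simply repeats the local results, which keeps the offline case on the same code path.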

This resulted in a bunch of infrastructure work, and I am still on the lookout for better solutions (I would love to hear your thoughts!). I really enjoyed using a graph database in this context as it truly fit.

As we build offline first, there is still room for solutions that truly nail the experience for client developers, and still give you the data you need on the backend. I am excited to see the energy in the database market these days and how it nicely maps into the reactive world where data changes cause UI updates. The data change can stream in from a user interaction, or from the backend, and it doesn’t matter!
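A minimal sketch of that reactive idea, assuming a tiny hand-rolled Store class rather than any particular library: the subscriber re-renders on every change and has no idea whether the change came from a tap or from a backend sync.

```typescript
type Listener<T> = (value: T) => void;

// Hypothetical observable value: holds state and notifies subscribers on change.
class Store<T> {
  private listeners: Listener<T>[] = [];
  private value: T;

  constructor(initial: T) {
    this.value = initial;
  }

  // Calls the listener immediately with the current value, then on every change.
  // Returns a function that removes the listener again.
  subscribe(listener: Listener<T>): () => void {
    this.listeners.push(listener);
    listener(this.value);
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  set(next: T): void {
    this.value = next;
    this.listeners.forEach((l) => l(next));
  }
}

// Stand-in for a UI: "rendering" just records the markup that would be shown.
const rendered: string[] = [];
const title = new Store<string>("Untitled");
const stopRendering = title.subscribe((t) => {
  rendered.push("<h1>" + t + "</h1>");
});

title.set("Edited on this device");    // e.g. a user interaction
title.set("Edited on another device"); // e.g. a change streamed from the backend
stopRendering();
title.set("Nobody is listening");      // no render after unsubscribing
```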

NoSQL is still evolving. Graph databases are the new hotness. We keep seeing new entrants, but I am also really excited to see SQL itself getting a shot in the arm with Spanner. I sit back and imagine SQL on the client and server with a sync solution in between and wonder…

Some people are TypeScripting

April 12, 2017

Programmers have debated type systems since the dawn of time, and maybe always will. When communicating with the machine, what level of information should we have to put into instructions for our digital friends? Do we have to be so explicit? Is it worth it? Can they just work it out for us? Can they even work it out better than we can?

"After having used TypeScript for nearly a year, I have to confess: I never want to start a new project without it again." – @tomdale https://t.co/fzXt7BfztP

— TypeScript (@typescript) April 12, 2017

This has cropped up on the JavaScript scene again with the rise of TypeScript. We keep hearing from engineers discussing what they are doing here and why, such as Slack’s usage in their desktop app, and Sir Tom Dale on how TypeScript is used in Glimmer (and how that differs from the Angular 2 tools, for example).

One of the key design choices of TypeScript makes it a fantastic gateway drug: the first step to usage is mv foo.js foo.ts. Not only is this trivial, but it makes the process non-daunting, and you often get value as the compiler finds subtle bugs that you didn’t know you had.
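As a small illustration of the kind of subtle bug that rename can surface, here is a hypothetical foo.ts whose contents are plain JavaScript with no annotations added. The settings object and totalDelay function are made up for the example.

```typescript
// foo.ts: the contents of foo.js after `mv foo.js foo.ts`, no type annotations added.
const settings = { retries: 3, timeoutMs: 500 };

function totalDelay() {
  // Suppose the .js version read `settings.timeoutMS` (capital S). That property
  // is undefined at runtime, so the product silently became NaN. Once the file
  // is .ts, the compiler infers the shape of `settings` from the literal above
  // and reports the misspelled property at build time. This is the fixed line:
  return settings.retries * settings.timeoutMs;
}
```

Nothing here was annotated by hand; the inferred object shape was enough for the compiler to catch the typo.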

When you get this value, you start to think things such as:

“An autocomplete system that only uses words in the current document feels barbaric afterward” — Felix Rieseberg

increasingly dislike the term/philosophy of “trade-offs”. don’t pick between things, find the right balance. pic.twitter.com/JT0gK5nfDq

— Luke Wroblewski (@LukeW) March 9, 2017

This is the heart of the trade off (sorry, balance) with TypeScript. When building large JavaScript systems many made the baby step to use doc comments to add @type info and more structure. This always seemed a sad hack. Rather than a first class solution we were resorting to hiding some extra goodness away from major parts of the system. Yes, we built special tools that would grok the comments, but man. It really didn’t feel right. If we feel the need to do types, can’t we put them more front and center?

In fact, is this the right balance, or are we failing to take a further step to a better place?

I honestly don't understand the draw to #TypeScript. Once you decide you no longer want to write JS, why stop at TS? 1/

— Phred (@fearphage) April 12, 2017

We are once again hiding valuable information from key parts of the system. Type erasure means that our runtime isn’t able to get the benefit of the information that we are conveying. This information could be gold when it comes to AOT compilation, which is especially important in the world of mobile. Will we keep pushing on from this middle ground and take the next step with types? So many were burned into thinking that the experience of types == that hellish enterprise Java 1.1 codebase, but the grey beards have always been shaking their heads at us there. We have type inference now, and much sounder type systems abound.
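A minimal sketch of what erasure means in practice, using a made-up Point interface: the annotation satisfies the compiler but leaves nothing behind at runtime, so any check that must survive compilation has to be written by hand.

```typescript
// This interface exists only at compile time; the emitted JavaScript contains no trace of it.
interface Point {
  x: number;
  y: number;
}

// A hand-written runtime check, needed precisely because the type is erased.
function isPoint(value: unknown): value is Point {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as { x?: unknown; y?: unknown };
  return typeof candidate.x === "number" && typeof candidate.y === "number";
}

// JSON.parse returns `any`, so this annotation is a promise to the compiler,
// not something the runtime verifies.
const trusted: Point = JSON.parse('{"x": "oops", "y": 2}');

// At runtime `trusted.x` is a string, whatever the annotation says.
const runtimeKind: string = typeof trusted.x;
const passesGuard = isPoint(trusted);
```

Here runtimeKind is "string" and passesGuard is false: the compiler was told one thing, and the runtime, which never saw the types, holds another.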

Lin’s diagram comparing phases of JS and WASM execution

And then we have WebAssembly. Lin Clark did her usual magic talking through the performance implications. We are seeing increasing experiments with wasm as a compile target that allows the system to boot up predictably fast. Get going, and then improve from there. This can be huge, especially in a world where the time from user intent (a tap) to a running system converts to engagement and gold.

It is also interesting to compare the rise of TypeScript with the rise of Kotlin. In my mind, the window opened for Kotlin at a time when the source code you wrote for Android couldn’t use some of the cool functionality that came to Java later. Similar to TypeScript though, it offered a really smooth learning curve. You can look at any Kotlin code and grok it really quickly, and you can start using it in your projects incrementally. That factor is so vital for the adoption of these languages. You don’t have to wait for a new project; you can give it a try in many other ways. Once you dive deeper though, as with TypeScript, there are more advanced features, and you run into practical choices around making the system grok Kotlin even more, creating extensions to access support library compat methods, or dealing with platform nullable types.

The allure of the incremental step is real. Is it an ideal end state, or will the masses go to a place where we push further? The good news is that there isn’t a shortage of experimentation, and the core platforms are getting better and better!

“We are programmed to receive. You can check-out any time you like, But you can never leave!”

Sustaining Open Source and Ecosystems

April 3, 2017

The topic of how to value open source, and how to make sure that it can continue to thrive and that the work is rewarded, is an old one. It has been heating up again recently with waves of tweetconvos, the latest being:

it's definitely not ideal. I'm just trying to explain why this ideal model of open source projects getting money from big companies doesn't

— vjeux ✪ (@Vjeux) April 1, 2017

It seems obvious that foundational work for companies that make a ton of money from it should be rewarded. However, rewards come in many flavors, and individuals create open source works for varied reasons.

Let’s bypass the topics of “you get other rewards…” (reputation, skill, community, job possibilities, and even rare golden tickets): what role should large companies play?

Large companies should invest in projects that they get value from. There are many ways to do this, and the issue of control often comes up. Are you giving money to a project, or looking to hire a particular committer and steer their direction? When does it make sense to hire vs. pay a contract? There are trade-offs for all parties when it comes to bringing someone on as an FTE: e.g. benefits and growth with the company vs. the bureaucracy of performance reviews 🙂 Do you sponsor specific issues or interests, or stay at arm’s length?

If you look at the majority of large companies you will see all types of arrangements in place that depend on many particulars.

I want to focus in on a type of sponsorship that ties into the question around funding for the Babel library. I think large companies have a responsibility to help an open ecosystem, such as the Web, thrive. At Google or Facebook, we make a great living from the Web, and much usage touches both companies.

It is definitely our turn to help out. The government kicked off the incubation. Vint Cerf and friends had what they needed in resource and constraints to make the Internet, open for all to build on top of. This seeded an ecosystem that allowed for many platforms and business models, all leading to today.

Now that we have commercial success, we need to step in to support the ecosystem. The Web still suffers from an obesity epidemic, and anyone who is helping deserves much credit. As folks invest though, we need to be very careful to do so without kingmaking and accidentally blocking future innovation. It is easy to do damage here, but that isn’t an excuse to sit back and do nothing.

I am excited to see the conversation continue and to work together on how best to garden the special platforms that we now have. Open Collective even has a conference on the topic in June.

Whenever I think about the topic I have to admit that I end up pulling on threads that end in frustration around how we value work in the world at large, and how we do not reward long-term value (see: how we pay teachers).