Youry's Blog

How To Get Hired — What CS Students Need to Know & A Future for Computing Education Research

  1. A Future for Computing Education Research (2015-01-20, full text)

  2. from

I’ve hired dozens of C/C++ programmers (mostly at the entry level). To do that, I had to interview hundreds of candidates. Many of them were woefully unprepared for the interview. This page is my attempt to help budding software engineers get and pass programming interviews.


What Interviewers are Tired Of

A surprisingly large fraction of applicants, even those with master’s degrees and PhDs in computer science, fail during interviews when asked to carry out basic programming tasks. For example, I’ve personally interviewed graduates who can’t answer “Write a loop that counts from 1 to 10” or “What’s the number after F in hexadecimal?” Less trivially, I’ve interviewed many candidates who can’t use recursion to solve a real problem. These are basic skills; anyone who lacks them probably hasn’t done much programming.

Speaking on behalf of software engineers who have to interview prospective new hires, I can safely say that we’re tired of talking to candidates who can’t program their way out of a paper bag. If you can successfully write a loop that goes from 1 to 10 in every language on your resume, can do simple arithmetic without a calculator, and can use recursion to solve a real problem, you’re already ahead of the pack!
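
For concreteness, here is roughly what answers to those three warm-up questions could look like. This is only a minimal sketch in Python (the essay itself is about C/C++ interviews); the function names and the nested-list problem are illustrative choices, not taken from the essay.

```python
def count_one_to_ten():
    """Return the numbers 1 through 10, built with a plain loop."""
    numbers = []
    for i in range(1, 11):  # the upper bound of range() is exclusive
        numbers.append(i)
    return numbers


def next_hex(digits):
    """Return the hexadecimal number that follows `digits`, in uppercase."""
    return format(int(digits, 16) + 1, "X")


def flatten(nested):
    """Recursively flatten arbitrarily nested lists into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))  # recursive case: descend into the sublist
        else:
            flat.append(item)  # base case: a plain value
    return flat


print(count_one_to_ten())
print(next_hex("F"))  # F is 15, so the next number is 16, written 10 in hex
print(flatten([1, [2, [3, 4]], 5]))
```

The recursion example is deliberately small; the point is only that the function calls itself on a smaller piece of the input and stops when it reaches a plain value.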

What Interviewers Look For

As Joel Spolsky wrote in his excellent essay The Guerrilla Guide to Interviewing:

1. Employers look for people who are Smart and Get Things Done

How can employers tell who gets things done? They go on your past record. Hence:

2. Employers look for people who Have Already Done Things

Or, as I was told once when I failed an interview:

3. You can’t get a job doing something you haven’t done before

(I was interviewing at HAL Computers for a hardware design job, and they asked me to implement a four-bit adder. I’d designed lots of things using discrete logic, but I’d always let the CPU do the math, so I didn’t know offhand. Then they asked me how to simulate a digital circuit quickly. I’d been using Verilog, so I talked about event simulation. The interviewer reminded me about RTL simulation, and then gently said the line I quoted above. I’ll never forget it.)

Finally, you may even find that

4. Employers Google your name to see what you’ve said and done

What This Means For You

What the above boils down to is: if you want to get a job programming, you have to do some real programming on your own first, and you have to get a public reputation, however minor, as a programmer. Don’t wait for your school to teach you how to design and program; they might never get around to it. College courses in programming are fine, probably even necessary, but most programming courses don’t provide the kind of experience that real programming gives, and real employers look for real programming experience.

Malcolm Gladwell wrote in Outliers,

… Ten thousand hours of practice is required to achieve the level of mastery associated with being a world-class expert — in anything.

Seems about right to me. I don’t know how many hours it takes to achieve the level of mastery required to program well enough to do a good job, but I bet it’s something like 500. (I think I had been programming or doing digital logic in one way or another for about 500 hours before my Dad started giving me little programming jobs in high school. During my five years of college, I racked up something like several hundred hours programming for fun, several hundred more in CS/EE lab courses, and about two thousand on paid summer jobs.)

But How Can I Get Experience Without a Job?

If you’re in college, and your school offers programming lab courses where you work on something seriously difficult for an entire term, take those courses. Good examples of this kind of course are

Take several courses of this kind if you can; each big project you design and implement will be good experience.

Whether or not you’re in college, nothing is stopping you from contributing to an existing Open Source project. One good way to start is to add unit or regression tests; nearly all projects need them, but few projects have a good set of them, so your efforts will be greatly appreciated.

I suggest starting by adding a conformance test to the Wine project. That’s great because it gives you exposure to programming both in Linux and in Windows. Also, it’s something that can be done without a huge investment of time; roughly 40 work hours should be enough for you to come up to speed, write a simple test, post it, address the feedback from the Wine developers, and repeat the last two steps until your code is accepted.

One nice benefit of getting code into an Open Source project is that when prospective employers Google you, they’ll see your code, and they’ll see that it is in use by thousands of people, which is always good for your reputation.

Quick Reality Check

If you want a quick reality check as to whether you can Get Things Done, I recommend the practice rooms at If you can complete one of their tasks in C++ or Java within an hour, and your solution actually passes all the system tests, you can definitely program your way out of a paper bag!

Here’s another good quick reality check, one closer to my heart.

Good luck!

Please let me know whether this essay was helpful. You can email me at dank at

Shameless Plug

I’m looking for a few good interns. If you live in Los Angeles, and you are looking for a C/C++ internship, please have a look at my internship page.


Written by youryblog

January 17, 2015 at 6:40 PM

Why can’t you find a job with a Stanford computer science PhD?


To many of my older colleagues, the idea that you possibly couldn’t find a job with a good degree, let alone a PhD, is unthinkable. And what about a promising young graduate in Computer Science from Stanford University? What if he has a PhD? He may not be able to secure an academic job, but industry recruiters will be all over him (or her). Surely!

The truth may be harsher.

Chand John wrote a touching article recounting his personal experience. No doubt, he expected to easily land a good industry job. At least, that is what his professors expected. Yet it took him a year to get a job. He was dismissed by most employers:

Despite having programmed computers since age 8, I was rejected from about 20 programming jobs. (…) my experience writing code at a university, even on a product with 47,000 unique downloads, didn’t count as coding “experience”.

There is a hidden assumption on campus that academic jobs are hard to get, but industry jobs are easy. Many computer science professors assume that they and their students could easily land a job at Google or any other tech company nearby. Along with this belief goes the notion that whatever happens on campus is years ahead of, and much more sophisticated than, what industry does. The story goes like this: government funds professors who have the ideas, they get their students to develop these ideas… and eventually these ideas end up getting picked up by industry when students get industry jobs. The story goes back to Vannevar Bush.

There is a problem however: this story does not match the facts. Employers do not recruit graduate students to get access to the work they did on campus. When a graduate student is recruited, he will be very lucky if his new employer has more than a passing interest in what he did on campus. It is not just employer reluctance: very few students could take what they learned as a graduate student and launch a business or a consulting venture.

The truth is that if you are fresh out of school, you will be the one doing most of the learning in industry. Even someone with a PhD can expect to be an apprentice for many years.

Also, let us be honest: the software produced on campus is rarely good. It is often made of untested, undocumented, and barely functioning prototypes. I have no doubt that Chand John wrote beautiful and maintainable software while at the university. However, I understand the skepticism of employers who hear “I wrote code as a student”. It is simply not a great reference. They hear “I wrote software for fun”.

So, people like Chand John end up with prize-winning research that is of little interest to anyone in industry. They wrote code on campus, but employers think “Oh! God! They will have to unlearn everything and start from scratch”. Is it any surprise that they are not offered the top programming jobs?

Of course, it is not entirely fair to say that Chand John couldn’t easily get an industry job. He does not tell us how selective he was. I am asked routinely by people from industry about clever graduate students. Presumably, what he couldn’t get easily is an interesting job. A job that would allow him to pay his student debts and offer him an intellectual challenge.

These jobs are scarce, both in industry and in academia.

Update: It looks like Chand works for Honda Research in what must be a desirable position.

Source: The idea for this post came to me from a G+ post by Suresh Venkatasubramanian.

Written by youryblog

January 17, 2015 at 6:39 PM

Effective System Modeling

Effective System Modeling (from

This post is a rough “transcript” (with some changes and creative freedom) of a session I gave in the Citi Innovation Lab, TLV about how to effectively model a system.

A Communication Breakdown?

Building complex software systems is not an easy task, for a lot of reasons. All kinds of solutions have been invented to tackle the different issues. We have higher level programming languages, DB tools, agile project management methodologies and quite a bit more. One could argue that these problems still exist, and no complete solution has been found so far. That may be true, but in this post, I’d like to discuss a different problem in this context: communicating our designs.

One problem that seems to be overlooked, or not addressed well enough, is communicating our designs and system architecture. Working alone, experienced engineers are (usually) quite capable of coming up with elegant solutions to complex problems. But the realities and dynamics of a software development organization, especially a geographically distributed one, often require us to communicate and reason about systems developed by others.

We – software engineers – tend to focus on solving the technical issues or designing the systems we’re building. This often leads to forgetting that software development, especially in the enterprise, is often, if not always, a team effort. Communicating our designs is therefore critical to our success, but is often viewed as a negligible activity at best, if not a complete waste of time.

The agile development movement, in all its variants, has done some good in bringing the issues of cooperation and communication into the limelight. Still, I often find that the communication of technical details, namely the structure and behavior of systems, is poorly done.

Why is that?

“Doing” Architecture

A common interpretation of agile development methods that I often encounter tends to throw the baby out with the bathwater. I hear about people and teams refusing to do “big up-front design”. That in itself is actually a good thing in my opinion. The problem starts when this turns into no design at all, which in turn means not wanting to spend time on documenting the architecture properly or on how it’s communicated.

But as anyone who’s been in this industry for more than a day knows, there’s no replacement for thinking about your design and your system, and agile doesn’t mean we shouldn’t design our system. So I claim that the problem isn’t really with designing per se, but rather with the motivation and methodology we use for “doing” our architecture: how we go about designing the system and conveying our thoughts. Most of us acknowledge the importance of thinking about a system, but we do not invest the time in preserving that knowledge and discussion. Communicating a design or system architecture, especially in written form, is often viewed as superfluous, given the working code and its accompanying tests. From my experience this is often the case because the actual communication and documentation of a design are done ineffectively.

This view was reinforced when I heard Simon Brown talk about a similar subject, which resonated with me. An architecture document/artifact should have “just enough” up front design to understand the system and create a shared vision. An architecture document should augment the code, not repeat it; it should describe what the code doesn’t already describe. In other words, don’t document the code, but rather look for the added value. A good architecture/design document adds value to the project team by articulating the vision on which all team members need to align. Of course, this is less apparent in small teams than in large ones, especially teams that need to cooperate on a larger project.

As a side note I would like to suggest that besides creating a shared understanding and vision, an architecture document also helps in preserving the knowledge and ramping-up people onto the team. I believe that anyone who has tried learning a new system just by looking at its code will empathize with this.

Since I believe the motivation to actually design the system and solve the problem is definitely there, I’m left with the feeling that people often view the task of documenting it and communicating it as unnecessary “bureaucracy”.
We therefore need a way to communicate and document our system’s architecture effectively. A way that will allow us to transfer knowledge, over time and space (geographies), but still do it efficiently – both for the writer and readers.
It needs to be a way that captures the essence of the system without drowning the reader in details or burdening the writer with work that will prove to be a waste of time. From a system analysis point of view, reading the document is quite possibly the more prominent use case compared to writing it; i.e. the document is going to be read far more often than it is written or modified.

When we come to the question of modeling a system, with the purpose of the end result being readable by humans, we need to balance the amount of formalism we apply to the model. A rigorous modeling technique will probably result in a more accurate model, but not necessarily an easily understandable one. Rigorous documents tend to be complete and accurate, but exhausting to read and follow, thereby defeating the purpose we’re trying to achieve. At the other end of the scale are free text documents, often in English and sometimes with some scribbled diagrams, which explain the structure or behavior of the system, often inconsistently. These are hard to follow for different reasons: inaccurate language, inconsistent terminology and/or an ad-hoc (=unfamiliar) modeling technique.

Providing an easy to follow system description, and doing so efficiently, requires us to balance these two ends. We need to have a “just enough” formalism that provides a common language. It needs to be intuitive to write and read, with enough freedom to provide any details needed to get a complete picture, but without burdening the writers and readers with unnecessary details.
In this post, I try to give an overview and pointers to a method I found useful in the past (not my invention), and that I believe answers the criteria mentioned above. It is definitely not the only way and may not suit everyone’s taste (e.g. Simon Brown suggests something similar but slightly different); but regardless of the method used, creating a shared vision, and putting it to writing is something useful, when done effectively.

System != Software

Before going into the technicalities of describing a system effectively, I believe we need to make the distinction between a system and its software.

For the purposes of our discussion, we’ll define software as a computer-understandable description of a dynamic system; i.e. one way to code the structure and behavior of a system in a way that’s understandable by computers.
A (dynamic) system on the other hand is what emerges from the execution of software.

To understand the distinction, an analogy might help: consider the task of understanding the issue of global warming (the system) vs. understanding the structure of a book about global warming (the software).

  • Understanding the book structure does not imply understanding global warming. Similarly, understanding the software structure doesn’t imply understanding the system.
  • The book can be written in different languages, but it’s still describing global warming. Similarly, software can be implemented using different languages and tools/technologies, but it doesn’t (shouldn’t) change the emergent behavior of the system.
  • Reading the content of the book implies understanding global warming. Similarly, the system is what emerges from execution of the software.

One point we need to keep in mind, and where this analogy breaks, is that understanding a book’s structure is considerably easier than understanding the software written for a given system.
So usually, when confronted with the need to document our system, we tend to focus on documenting the software, not the system. This leads to ineffective documentation/modeling (we’re documenting the wrong thing), eventually leading to frustration and missing knowledge.
This is further compounded by the fact that existing tools and frameworks for documenting software (e.g. UML) tend to be complex and detailed, with the tooling emphasizing code generation rather than human communication.

Modeling a System

When we model an existing system, or design a new one, we find several methods and tools that help us. A lot of these methods define all sorts of views of the system – describing different facets of its implementation. Most practitioners have surely met one or more different “types” of system views: logical, conceptual, deployment, implementation, high level, behavior, etc. These all provide some kind of information as to how the system is built, but there’s not a lot of clarity on the differences or roles of each such view. These are essentially different abstractions or facets of the given system being modeled. While any such abstraction can be justified in itself, it is the combination of these that produces an often unreadable end result.

So, as with any other type of technical document you write, the first rule of thumb is:

Rule of thumb #1: Tailor the content to the reader(s), and be explicit about it.

In other words – set expectations. Set the expectation early on – what you’re describing and what is the expected knowledge (and usually technical competency) of the reader.

Generally, in my experience, three facets are the most important: the structure of the system (how it’s built), the behavior of the system (how the different components interact on given inputs/events), and the domain model used in the system. Each of these facets can be described in more or less detail, at different abstraction levels, and using different techniques, depending on the case. But these are usually the most important facets for a reader who needs to understand the system and then approach designing or reading the code.

Technical Architecture Modeling

One method I often find useful is that of Technical Architecture Modeling (TAM), itself a derivative of Fundamental Modeling Concepts (FMC). It is a formal method, but one which focuses on human comprehension. As such, it borrows from UML and FMC to provide a level of formalism which seems to strike a good balance between readability and modeling efficiency. TAM uses a few diagram types, of which the most useful are the component/block diagram, used to depict a system’s structure or composition; the activity and sequence diagrams, used to model a system/component’s behavior; and the class diagram, used to model a domain (value) model. Other diagram types are also included, e.g. state charts and deployment diagrams, but these are less useful in my experience. TAM also has some tool support in the form of Visio stencils that make it easier to integrate it into other documentation methods.

I briefly discuss how the most important facets of a system can be modeled with TAM, but the reader is encouraged to follow the links given above (or ask me) for further information and details.

Block Diagram: System Structure

A system’s structure, or composition, is described using a simple block diagram. At its simplest form, this diagram describes the different components that make up the system.
For example, describing a simple travel agency system, with a reservation and information system can look something like this (example taken from the FMC introduction):

Sample: Travel Agency System

This in itself already tells us some of the story: there’s a travel agency system, accessed by customers and other interested parties, with two subsystems: a reservation system and an information help desk system. The information is read and written to two separate data stores holding the customer data and reservations in one store, and the travel information (e.g. flight and hotel information) in the other. This data is fed into the system by external travel-related organizations (e.g. airlines, hotel chains), and reservations are forwarded to the same external systems.

This description is usually enough to provide at least high-level contextual information about the system. But the diagram above already tells us a bit more. It provides some information about the access points to the data, about the different kinds of data flowing in the system, and about which component interacts with which other component (who knows who). Note that there is little to no technical information at this point.

The modeling language itself is pretty straightforward and simple as well: we have two main “entities”: actors and data stores.
Actors, designated by square-cornered rectangles, are any components that do something in the system (including humans). They are the active components of the system. Actors communicate with other actors through channels (lines with small circles on them), and they read/write from/to data stores (simple lines with arrow heads). Examples include services, functions and human operators of the system.
Data stores, designated by rounded rectangles (/circles), are passive components. These are “places” where data is stored. Examples include database systems, files, and even memory arrays (or generally any data structure).

Armed with these definitions, we can already identify some useful patterns, and how to model them:

Read only access – actor A can only read from data store S:
Read only access


Write only access – actor A can only write to data store S:
Write only access


Read/Write access:
Read/Write access


Two actors communicating on a request/response channel have their own unique symbol:
In this case, actor ‘B’ requests something from actor ‘A’ (the arrow on the ‘R’ symbol points to ‘A’), and ‘A’ answers back with data. So data actually flows in both directions. A classic example of this is a client browser asking a web server for a web page.


A simple communication over a shared storage:
Actors ‘A’ and ‘B’ both read from and write to data store ‘S’, effectively communicating over it.


There’s a bit more to this formalism, which you can explore on the FMC/TAM website, but not much more than what’s shown here. These simple primitives already provide a powerful expression mechanism to convey most of the ideas we need to communicate about our system on a daily basis.
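
To make the primitives concrete, the sketch below captures actors, data stores, channels and read/write access as plain Python data. It is only an illustration under stated assumptions: the `BlockDiagram` class and its method names are hypothetical (they are not part of FMC or TAM), and the access modes assigned to the travel agency components are guesses made for the example, not read off the original diagram.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDiagram:
    """A tiny in-memory stand-in for a structural block diagram (hypothetical)."""
    actors: set = field(default_factory=set)      # active components
    stores: set = field(default_factory=set)      # passive components
    channels: set = field(default_factory=set)    # unordered pairs of actors
    accesses: dict = field(default_factory=dict)  # (actor, store) -> set of modes

    def add_channel(self, actor_a, actor_b):
        """Connect two actors with a communication channel."""
        self.actors.update((actor_a, actor_b))
        self.channels.add(frozenset((actor_a, actor_b)))

    def add_access(self, actor, store, mode):
        """Record that an actor reads from or writes to a data store."""
        if mode not in ("read", "write"):
            raise ValueError(f"unknown access mode: {mode!r}")
        self.actors.add(actor)
        self.stores.add(store)
        self.accesses.setdefault((actor, store), set()).add(mode)

    def access_pattern(self, actor, store):
        """Name the access pattern between an actor and a store."""
        modes = self.accesses.get((actor, store), set())
        if modes == {"read", "write"}:
            return "read/write"
        if modes:
            return next(iter(modes)) + " only"
        return "no access"

    def shared_stores(self, actor_a, actor_b):
        """Stores that both actors read and write, i.e. storage they can communicate over."""
        both = {"read", "write"}
        return sorted(
            store for store in self.stores
            if self.accesses.get((actor_a, store)) == both
            and self.accesses.get((actor_b, store)) == both
        )


# Assumed access modes, for illustration only.
diagram = BlockDiagram()
diagram.add_channel("Customer", "Reservation system")
diagram.add_channel("Customer", "Information help desk")
diagram.add_access("Reservation system", "Reservations", "read")
diagram.add_access("Reservation system", "Reservations", "write")
diagram.add_access("Information help desk", "Travel information", "read")
diagram.add_access("Travel organization", "Travel information", "write")

print(diagram.access_pattern("Reservation system", "Reservations"))
print(diagram.access_pattern("Information help desk", "Travel information"))
print(diagram.access_pattern("Travel organization", "Travel information"))
```

Keeping the model this small is deliberate: it records who talks to whom and who touches which data, and nothing about technology.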

Usually, when providing such a diagram, it’s good practice to accompany it with some text that provides some explanation on the different components and their roles. This shouldn’t be more than 1-2 paragraphs, but actually depends on the level of detail and system size.

This would generally help with two things: identifying redundant components, and describing the responsibility of each component clearly. Think of this text explanation as a way to validate your modeling, as displayed in the diagram.

Rule of thumb #2: If your explanation doesn’t include all the actors/stores depicted in the diagram, you probably have redundant components.
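
This rule of thumb lends itself to a quick mechanical check. The following is a rough sketch, assuming the component names from the diagram are available as a list and the explanation as plain text; the function name `unmentioned_components` and the sample text are made up for illustration.

```python
def unmentioned_components(explanation, component_names):
    """Return the component names that never appear in the explanation text."""
    text = explanation.lower()
    return sorted(name for name in component_names if name.lower() not in text)


explanation = (
    "Customers use the reservation system, which keeps bookings "
    "in the reservations store."
)
components = ["Reservation system", "Reservations store", "Audit logger"]

# Any name listed here is either redundant or missing from the explanation.
print(unmentioned_components(explanation, components))
```

A plain substring match is crude (it will miss synonyms and abbreviations), so treat a hit as a prompt to look again rather than as proof of redundancy.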

Behavior Modeling

The dynamic behavior of a system is of course no less important than its structure. The cooperation, interaction and data flow between components allow us to identify failure points, bottlenecks, decoupling problems etc. In this case, TAM largely adopts the UML practice of using sequence diagrams or activity diagrams, whose description is beyond the scope of this post.

One thing to keep in mind though, is that when modeling behavior in this case, you’re usually not modeling interaction between classes, but rather between components. So the formalism of “messages” sent between objects need not couple itself to code structure and class/method names. Remember: you generally don’t model the software (code), but rather system components. So you don’t need to model the exact method calls and object instances, as is generally the case with UML models.

One good way to validate the model at this point is to verify that the components mentioned in the activity diagram are mentioned in the system’s structure (in the block diagram); and that components that interact in the behavioral model actually have this interaction expressed in the structural model. A missing interaction (e.g. channel) in the structural view may mean that these two components have an interface that wasn’t expressed in the structural model, i.e. the structure diagram should be fixed; or it could mean that these two components shouldn’t interact, i.e. the behavioral model needs to be fixed.

This is the exact thought process that this modeling helps to achieve – modeling two different facets of the system and validating one with the other in iterations allows us to reason and validate our understanding of the system. The explicit diagrams are simply the visual method that helps us to visualize and capture those ideas efficiently. Of course, keep in mind that you validate the model at the appropriate level of abstraction – don’t validate a high level system structure with a sequence diagram describing implementation classes.

Rule of thumb #3: Every interaction modeled in the behavioral model (activity/sequence diagrams) should be reflected in the structural model (block diagram), and vice versa.
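
The cross-check between the two views can be sketched as a comparison of two sets of component pairs. This is a minimal illustration, assuming each model has already been reduced to a list of pairs; the function name `compare_models` and the sample pairs are hypothetical.

```python
def compare_models(structural_links, behavioral_interactions):
    """Compare component pairs from the structural and behavioral views.

    Returns two sorted lists of pairs:
    - interactions seen in the behavior but missing from the structure
    - structural links that no behavioral diagram exercises
    """
    structure = {frozenset(pair) for pair in structural_links}
    behavior = {frozenset(pair) for pair in behavioral_interactions}

    def as_sorted_pairs(pairs):
        return sorted(tuple(sorted(pair)) for pair in pairs)

    return as_sorted_pairs(behavior - structure), as_sorted_pairs(structure - behavior)


structural_links = [
    ("Customer", "Reservation system"),
    ("Customer", "Information help desk"),
]
behavioral_interactions = [
    ("Customer", "Reservation system"),
    ("Reservation system", "Information help desk"),
]

missing_in_structure, unused_in_behavior = compare_models(
    structural_links, behavioral_interactions
)
print("Fix the structure or the behavior:", missing_in_structure)
print("Not exercised by any behavior yet:", unused_in_behavior)
```

Pairs are treated as unordered, so the direction of a message does not matter here. A pair in the second list is not necessarily an error; it may only mean that the behavior using that link has not been modeled yet.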

Domain Modeling

Another often useful aspect of modeling a system is modeling the data processed by the system. It helps to reason about the algorithms, expected load and eventually the structure of the code. This is often the part that’s not covered by well known patterns and needs to be carefully tuned per application. It also helps in creating a shared vocabulary and terminology when discussing different aspects of the developed software.

A useful method in the case of domain modeling is UML class diagrams, which TAM also adopts. In this case as well, I often find a more scaled-down version the most useful, usually focused on the main entities, and their relationships (including cardinality). The useful notation of class diagrams can be leveraged to express these relationships quite succinctly.
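
As a rough textual counterpart to such a scaled-down class diagram, the sketch below expresses two main entities and one relationship with its cardinality. The entities (`Customer`, `Reservation`) and their fields are assumptions chosen to match the travel agency example; they are not taken from the FMC material.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reservation:
    """A single booking; belongs to exactly one customer."""
    reference: str
    destination: str


@dataclass
class Customer:
    """A customer holds zero or more reservations (1 to 0..* cardinality)."""
    name: str
    reservations: List[Reservation] = field(default_factory=list)

    def reserve(self, reference, destination):
        """Create a reservation and attach it to this customer."""
        reservation = Reservation(reference, destination)
        self.reservations.append(reservation)
        return reservation


customer = Customer("Dana")
customer.reserve("R-1", "Lisbon")
customer.reserve("R-2", "Oslo")
print(len(customer.reservations))
```

Only the main entities and the relationship are shown; anything closer to the real code would go stale quickly.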

Explicit modeling of the code itself is rarely useful in my opinion: the code will probably be refactored far faster than a model will be updated, and a reader who is able to read a detailed class diagram can also read the code it describes. One exception to this rule might be when your application deals with code constructs, in which case the code constructs themselves (e.g. interfaces) serve as the API to your system, and clients will need to write code that integrates with it as a primary usage pattern of the system. An example of this is an extensible library of any sort (Eclipse plugins are one prominent example, but there are more).

Another useful modeling facet in this context is to model the main concepts handled in the system. This is especially useful in very technical systems (oriented at developers), that introduce several new concepts, e.g. frameworks. In this case, a conceptual model can prove to be useful for establishing a shared understanding and terminology for anyone discussing the system.

Iterative Refinement

Of course, at the end of the day, we need to remember that modeling a system in fact reflects a thought process we have when designing the system. The end product, in the form of a document (or set of documents), represents our understanding of the system: its structure and behavior. But this is never a one-way process. It is almost always an iterative process that reflects our evolving understanding of the system.

So modeling a specific facet of the system should not be seen as a one-off activity. We often follow a dynamic where we model the structure of the system, but then try to model its behavior, only to realize the structure isn’t sufficient or leads to a suboptimal flow. This back and forth is actually a good thing – it helps us to solidify our understanding and converge on a widely understood and accepted picture of how the system should look, and how it should be constructed.

Refinements also happen on the axis of abstractions. Moving from a high level to a lower level of abstraction, we can provide more details on the system. We can refine as much as we find useful, up to the level of modeling the code (which, as stated above, is rarely useful in my opinion). Also when working on the details of a given view, it’s common to find improvement points and issues in the higher level description. So iterations can happen here as well.

As an example, consider the imaginary travel agency example quoted above. One possible refinement of the structural view could be something like this (also taken from the site above):

Example: travel agency system refined

In this case, more detail is provided on the implementation of the information help subsystem and the ‘Travel Information’ data store. Although it provides some more (useful) technical details, this is still a block diagram, describing the structure of the system. This level of detail refines the high level view shown earlier, and already provides more information and insight into how the system is built: for example, how the data stores are implemented and accessed, and the way data is adapted and propagated in the system. The astute reader will note that the ‘Reservation System’ subsystem now interacts with the ‘HTTP Server’ component in the ‘Information help desk’ subsystem. This makes sense from a logical point of view, since the reservation system accesses the travel information through the same channels used to provide information to other actors, but this information was missing from the first diagram (no channel between the two components).
One important rule of thumb is that as you go down the levels of abstraction, you should keep the names of actors presented at the higher level of abstraction. This allows readers to correlate the views more easily, identify the different actors, and reason about their place in the system. It provides a context for the finer-grained details. As the example above shows, the more detailed diagram still includes the actor and store names from the higher level diagram (‘Travel Information’, ‘Information help desk’, ‘Travel Agency’).

Rule of thumb #4: Be consistent about names when moving between different levels of abstraction. Enable correlations between the different views.
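
Name consistency between two levels of abstraction can be checked with a simple set difference. This is a small sketch, assuming the names used in each diagram have been collected into lists; the function name `dropped_names` is hypothetical, and ‘Travel Info DB’ is an invented rename used only to show a failure.

```python
def dropped_names(high_level_names, detailed_names):
    """Return high-level names that no longer appear in the detailed view."""
    return sorted(set(high_level_names) - set(detailed_names))


high_level = ["Travel Agency", "Information help desk", "Travel Information"]

# A refinement that keeps the original names and adds detail.
consistent = high_level + ["HTTP Server"]
# A refinement that silently renames one store (invented for this example).
renamed = ["Travel Agency", "Information help desk", "Travel Info DB", "HTTP Server"]

print(dropped_names(high_level, consistent))
print(dropped_names(high_level, renamed))
```

An empty result means every high-level name survives in the refined view, so a reader can still correlate the two diagrams.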

Communicating w/ Humans – Visualization is Key

With all this modeling activity going on, we have to keep in mind that our main goal, besides good design, is communicating this design to other humans, not machines. This is why, reluctant as we are to admit it (engineers…) – aesthetics matter.

In the context of enterprise systems, communicating the design effectively is as important to the quality of the resulting software as designing it properly. In some cases, it might be even more important: just consider how much time you sometimes spend on system integration versus how much time you spend writing the software itself. So a good looking diagram is important, and we should be mindful about how we present it to the intended audience.

Following are some tips and pointers on what to look for when considering this aspect of communicating our designs. This is by no means an exhaustive list, but more based on experience (and some common sense). More pointers can be found in the links above, specifically in the visualization guide.

First, keep in mind that the visual arrangement of nodes and edges in your diagram directly affects how clear the diagram is to readers. Try to minimize intersections between edges, and align edges on horizontal and vertical axes.
Compare these two examples:

Aligning vertices

The arrangement on the left is definitely clearer than the one on the right. Note that generally speaking, the size of a node does not imply any specific meaning; it is just a visual convenience.

Similarly, this example:

Visual alignment

shows how the re-arrangement of nodes allows for less intersection, without losing any meaning.

Colors can also be very useful in this case. One can use colors to help distinguish between different levels of containment:

Using colors

In this case, the usage of colors helps to distinguish an otherwise confusing structure. Keep in mind that readers might print your document on a black and white printer, or might be color blind, so use high contrast colors where possible.

Label styles are generally not very useful to convey meaning. Try to stick to a very specific font and be consistent with it. An exception might be a label that pertains to a different aspect, e.g. configuration files or code locations, which might be more easily distinguished when using a different font style.

Visuals have Semantics

One useful way to leverage the colors and layout of a diagram is to stress specific semantics you want to convey. You might use colors to distinguish a set of components from other components, e.g. to highlight team responsibilities or specific implementation details. Note that this kind of technique is not standard, so remember to include an explanation (a legend) of what the different colors mean. Also, too many colors might cause more clutter, eventually defeating the purpose of clarity.

Another useful technique is to use the layout of the nodes in the graph to convey an understanding. For example, the main data flow can be hinted at in a block diagram by laying out the nodes from left to right, or from top to bottom. This is not required, nor does it carry any formal meaning. But it is often useful, and it provides hints as to how the system actually works.


As we’ve seen, “doing” architecture, while often perceived as a cumbersome and unnecessary activity, isn’t hard to do when done effectively. We need to keep in mind the focus of this activity: communicating our designs and reasoning about them over longer periods of time.

Easing the collaboration around design is not just an issue of knowledge sharing (though that’s important as well), but it is a necessity when trying to build software across global teams, over long periods of time. How effectively we communicate our designs directly impacts how we collaborate, the quality of produced software, how we evolve it over time, and eventually the bottom line of deliveries.

I hope this (rather long) post has shed some light on the subject, provided some insight and useful tips, and encouraged people to invest some effort in learning further.

Written by youryblog

January 17, 2015 at 6:27 PM

Posted in SW Design, SW Eng./Dev.

How much you need to earn to buy a house in every major Canadian city

leave a comment »

I like this information



How much you need to earn to buy a house in every major Canadian city

Elizabeth Bromstein| Jan 6, 2015 01:07 pm


Many say property is the best investment you can make. Bursting housing bubbles and mortgage scandals aside, they’re usually right.

The price of making that investment varies widely in Canada, depending on where you live. We looked at how much you need to earn to buy an average-priced house in every major Canadian city.

To get these numbers, we consulted Adrian Williams, a Toronto mortgage broker, and used his calculator found here. He explained that to calculate the income required you need to know the purchase price, down payment, rate, utilities – mortgage qualifying must include a minimum of $100 a month for heating costs – and taxes.

We got the average purchase price per city from the Canadian Real Estate Association (prices fluctuate; these were in effect at the end of December. To see the absolute latest click here and input your city), and Williams provided the property tax rates. At his suggestion we used a 2.99% interest rate, which is the average qualifying rate for a 5-year fixed term. We used a down payment of 10% of the purchase price and calculated $100 a month for utilities.

According to Williams, “Other factors that will be included with mortgage qualification are the total monthly payment obligations from credit card, LOC’s, personal & car loans, car lease and other types of credit that require a monthly payment.”

Here is what you need to earn to buy a house in every major Canadian market. (Numbers are rounded to the nearest dollar.)


Vancouver

Average price: $819,336

Monthly mortgage payment: $3,570

Property tax: $251

Income required: $147,023


Calgary

Average price: $465,047

Monthly mortgage payment: $2,026

Property tax: $236

Income required: $88,578


Edmonton

Average price: $365,520

Monthly mortgage payment: $1,592

Property tax: $244

Income required: $72,617


Regina

Average price: $331,161

Monthly mortgage payment: $1,443

Property tax: $378

Income required: $72,028


Saskatoon

Average price: $349,322

Monthly mortgage payment: $1,522

Property tax: $366

Income required: $74,546


Winnipeg

Average price: $270,605

Monthly mortgage payment: $1,179

Property tax: $274

Income required: $58,235


Ottawa

Average price: $357,887

Monthly mortgage payment: $1,559

Property tax: $336

Income required: $74,820


Toronto

Average price: $587,505

Monthly mortgage payment: $2,560

Property tax: $354

Income required: $113,009


Montreal

Average price: $344,273

Monthly mortgage payment: $1,500

Property tax: $237

Income required: $68,884


Halifax

Average price: $264,447

Monthly mortgage payment: $1,152

Property tax: $266

Income required: $56,929
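
For readers who want to reproduce the arithmetic, here is a rough sketch of how figures like these could be derived. The article only states the 2.99% rate, the 10% down payment, and the $100 a month for heating. Everything else below is an assumption and not taken from the article or from Williams's calculator: a 25-year amortization, semi-annual compounding, a 2.4% mortgage insurance premium added to the loan, a 32% limit on housing costs as a share of gross income, and the reading of the listed property tax as a monthly amount. The function names are hypothetical.

```python
def monthly_payment(price, down_fraction=0.10, annual_rate=0.0299,
                    years=25, insurance_premium=0.024):
    """Monthly payment on a fixed-rate mortgage (sketch under stated assumptions)."""
    # Assumption: the insurance premium is added to the borrowed amount.
    principal = price * (1 - down_fraction) * (1 + insurance_premium)
    # Assumption: the quoted rate compounds semi-annually, so convert it
    # to an equivalent monthly rate (six months per compounding period).
    monthly_rate = (1 + annual_rate / 2) ** (1 / 6) - 1
    n_payments = years * 12
    # Standard annuity formula for a level monthly payment.
    return principal * monthly_rate / (1 - (1 + monthly_rate) ** -n_payments)


def income_required(price, monthly_tax, monthly_heat=100.0, housing_share=0.32):
    """Gross annual income at which housing costs equal housing_share of income."""
    monthly_housing = monthly_payment(price) + monthly_tax + monthly_heat
    return monthly_housing * 12 / housing_share


# First listing above: average price $819,336, property tax $251.
print(round(monthly_payment(819_336)))
print(round(income_required(819_336, 251)))
```

Under these assumptions the results for the first listing come out within a few dollars of the published payment and income figures, which suggests the assumptions are in the right neighborhood, though it does not confirm they are what the calculator actually uses.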



Written by youryblog

January 9, 2015 at 5:28 PM

Posted in Business, Interesting

Over 70% of the cost (time) of developing a program goes out after it has been released +

leave a comment »

Thu, 1 Jan 2015

Actually I found that usually the ones that find it the most fascinating
write the least legible code because they never bother with software
engineering and design.

You can get a high school wiz kid to write the fastest code there is, but
there is no way you will be able to change anything about it five minutes

Considering that over 70% of the cost (time) of developing a program goes
out after it has been released, when changes start to be asked for, that is
a problem.

Micha Feigin
Csail-related mailing list

Interesting view on student's grade: Dear Student: No, I Won’t Change the Grade You Deserve

Written by youryblog

January 2, 2015 at 10:04 PM

The Tears of Donald Knuth

leave a comment »

Donald Knuth


In this column I will be looking at the changing relationship between the discipline of computer science and the growing body of scholarly work on the history of computing, beginning with a recent plea made by renowned computer scientist Donald Knuth. This provides an opportunity to point you toward some interesting recent work on the history of computer science and to think more broadly about what the history of computing is, who is writing it, and for whom they are writing.

Last year historians of computing heard an odd rumor: that Knuth had given the Kailath lecture at Stanford University and spent the whole time talking about us. Its title, “Let’s Not Dumb Down the History of Computer Science,” was certainly intriguing, and its abstract confirmed that some forceful positions were being taken.[a] The online video eventually showed something remarkable: his lecture focused on a single paper, Martin Campbell-Kelly’s 2007 “The History of the History of Software.”[6,b] Reading it had deeply saddened Knuth, who “finished reading it only with great difficulty” through his tear-stained glasses.


What Knuth Said

Knuth began by announcing that, despite an aversion to confrontation, he would be “flaming” historians of computing. This, he worried, “could turn out to be the biggest mistake of my life.” The bout might nevertheless be seen as a mismatch. Knuth is among the world’s most celebrated computer scientists, renowned for his ongoing project to classify and document families of algorithms in The Art of Computer Programming and for his creation of the TeX computerized typesetting system ubiquitous within computer science and mathematics. Campbell-Kelly has a similar prominence within the much smaller community of historians of computing but, even by Google Scholar’s generous definitions, the paper that saddened Knuth has been cited only nine times.

Knuth then enumerated his motivations, as a computer scientist, to read the history of science. First, reading history helped him to understand the process of discovery. Second, understanding the difficulty and false starts experienced by brilliant historical scientists in making discoveries that specialists now find obvious helped him to see what made concepts challenging to students and thus to become a “much better writer and teacher.” Third, appreciating the historical contribution of non-Western scientists helped in “celebrating the contributions of many cultures.” Fourth, history is the craft of telling stories, which is “the best way to teach, to explain something.” Fifth, the biographies of scientists teach tactics for a successful and rewarding career. Sixth, history teaches how human experience has changed over time. As humans we should care about that.

Knuth also identified some special contributions to the history of science that professionally trained historians are uniquely well placed to make. We are good at “smoking out” primary sources and putting historical activities in the context of broader timelines. He also appreciates our ability to translate papers written in languages that he cannot himself read. He finds attempts at historical analysis “probably the least interesting” aspects of our papers but appreciates lengthy quotations from primary sources.

Things then headed in a less positive direction. Knuth explained that Campbell-Kelly had centered his paper on a table of important works related to the history of software published between 1967 and 2004. It coded the predominant approaches into four categories—one of which was technical—to demonstrate the technical approach had been dominant until about 1990, dwindling thereafter and vanishing altogether after 1997. Campbell-Kelly characterized this as an “evolution” away from “technical histories” of the “low-hanging-fruit variety” written by Knuth and other “outstanding technical experts” that were “constrained, excessively technical, and lacking in breadth of vision.”

Knuth had previously viewed Campbell-Kelly as a kindred spirit but had now been granted a glimpse of “what historians say when they’re talking to historians instead of when they’re talking to people like me.” Without pausing to dry his glasses he had written to Campbell-Kelly to accuse him of having “lost faith in the notion that computer science is actually scientific.”


The shift described by Campbell-Kelly reflected a change in the population of scholars writing the history of computing. Many of the senior computing figures of the 1970s worked to preserve the history of the 1940s and early 1950s, starting with a number of “pioneer days” and workshops. The most important of these was held at Los Alamos National Laboratory in 1976.[15] Most of the 90 participants included in the group photograph of attendees were computer pioneers of the 1940s. Knuth himself contributed a detailed history of the first tools for “automatic programming” (assemblers and compilers). He was one of a handful of interested younger computer scientists who entered the field in the 1950s, which also included Edsger Dijkstra and Brian Randell, a systems programmer turned academic who had assembled an important collection of reprinted historical documents. Only a handful of trained historians were at the conference. The editorial board of Annals of the History of Computing, which began in 1979 as a publication of AFIPS, a long-defunct umbrella group for professional computing societies, had a similar makeup. As graduate students in history and history of science programs began to write dissertations on computer-related topics they eventually inverted the ratio of trained historians to computer scientists, though the journal continues to publish a significant number of papers by computer scientists and technical experts.

In his lecture Knuth worried that a “dismal trend” in historical work meant that “all we get nowadays is dumbed down” through the elimination of technical detail. According to Knuth “historians of math have always faced the fact that they won’t be able to please everybody.” He feels that other historians of science have succumbed to “the delusion that … an ordinary person can understand physics …”

I am going to tell you why Knuth’s tears were misguided, or at least misdirected, but first let me stress that historians of computing deeply appreciate his conviction that our mission is of profound importance. Indeed, one distinguished historian of computing recently asked me what he could do to get flamed by Knuth. Knuth has been engaged for decades with history. This is not one of his passionate interests outside computer science, such as his project reading verses 3:16 of different books of the Bible. Knuth’s core work on computer programming reflects a historical sensibility, as he tracks down the origin and development of algorithms and reconstructs the development of thought in specific areas. For years advertisements for IEEE Annals of the History of Computing, where Campbell-Kelly’s paper was published, relied on a quote from Knuth that it was the only publication he read from cover to cover. With the freedom to choose a vital topic for a distinguished lecture Knuth chose to focus on history rather than one of his better-known scientific enthusiasms such as literate programming or his progress with The Art of Computer Programming.


Computing vs. Computer Science

Here is where I part ways with Knuth’s interpretation. Campbell-Kelly’s article was “The History of the History of Software,” not “The History of the History of Computer Science.” Knuth’s complaint that historians have been led astray by fads and pursuit of a mass audience into “dumbed down” history reflects an assumption that computer science is the whole of computing, or at least the only part in which historians can find important questions about software. This conflates the history of computing with the history of computer science. Distinguished computer scientists are prone to blur their own discipline, and in particular a few dozen elite programs, with the much broader field of computing. The tools and ideas produced by computer scientists underpin all areas of IT and make possible the work carried out by network technicians, business analysts, help desk workers, and Excel programmers. That does not make those workers computer scientists. The U.S. alone is estimated to have more than 10 million “information technology workers,” which is about a hundred times more than the ACM’s membership. Vint Cerf has warned in Communications that even the population of “professional programmers” dwarfs the association’s membership.[7] ACM’s share of the IT workforce has been in decline for a half-century, despite efforts begun back in the 1960s and 1970s by leaders such as Walter Carlson and Herb Grosch to broaden its appeal.

Computing is much bigger than computer science, and so the history of computing is much bigger than the history of computer science. Yet Knuth treated Campbell-Kelly’s book on the business history of the software industry (accurately subtitled “a history of the software industry”) and all the rest of the history of computing as part of “the history of computer science.”[4] Others have written about the history of computer use in life insurance and other areas of business, the history of cybernetics, the history of the semiconductor industry, the history of punched card machines, the history of the IT workforce, the history of computer-producing companies such as IBM, the use and development of computers in particular countries, the history of the personal computer, and the history of computer usage in particular areas of scientific practice such as bio-medicine. To call such work “dumbed down” history of computer science, rather than smart history of many other things, is to misunderstand both the intentions and the accomplishments of its authors.

The truth is that regrettably little history of computer science, whether dumb or deep, has been written by trained historians even though the history of computing literature as a whole has been expanding rapidly. Consider our output between 1990 and 2010. Michael Mahoney, a historian of science and mathematics at Princeton University, worked on a narrative history of theoretical computer science but ultimately produced only a set of provocative but schematic papers.[13] Mahoney was also interested in the history of software engineering, and several other historians have discussed the 1968 NATO Conference on Software Engineering at which that field was launched. Eminent sociologist of science Donald MacKenzie worked on the history of formal methods and its relationship to the development of computer technology.[11,12] Two books explored the history of DARPA and its role in shaping the development of computer science and technology, though Knuth would not approve of their institutional focus.[17,19] William Aspray wrote several papers on the history of NSF support for computing[2] and a book on John von Neumann.[1] A complete list would be longer, but not that much longer.


Historical Careers in Computer Science

So why is the history of computer science not being written in the volume it deserves, or the manner favored by Knuth? I am, at heart, a social historian of science and technology and so my analysis of the situation is grounded in disciplinary and institutional factors. Books of this kind would demand years of expert research and sell a few hundred copies. They would thus be authored by those not expected to support themselves with royalties, primarily academics.

Academic careers are profoundly shaped by the disciplinary communities in which they develop. Throughout their training, scholars are socialized into the culture of their field and pick up a wealth of tacit and explicit knowledge on what is expected of them. They learn how to select a research project, what kinds of work are noticed and which are ignored, what style to write in, how to structure a paper, which professors are respected, what search committees and grant review panels are looking for. This continues throughout their careers, as they aspire to prestigious awards, named chairs, or favors from the Dean. Whether they realize it or not, successful academics have internalized the rules of the game played in their particular field.

The history of computer science might be undertaken from two disciplinary base camps within academia: computer science and the history of science. Someone whose primary training is in history will naturally see the history of computing differently from someone whose disciplinary loyalty is to computer science. They will choose different topics and explore them in different ways for different audiences. For different reasons, outlined below, neither group has shown much interest in supporting work of the kind favored by Knuth. That is why it has rarely been written.


Prospects within the History of Science

The history of science is a kind of history, which is in turn part of the humanities. Some historians of science are specialists within broad history departments, and others work in specialized programs devoted to science studies or to the history of science, technology, or medicine. In both settings, historians judge the work of prospective colleagues by the standards of history, not those of computer science. There are no faculty jobs earmarked for scholars with doctoral training in the history of computing, still less in the history of computer science. The persistently brutal state of the humanities job market means that search committees can shortlist candidates precisely fitting whatever obscure combination of geographical area, time period, and methodological approaches are desired. So a bright young scholar aspiring to a career teaching and researching the history of computer science would need to appear to a humanities search committee as an exceptionally well qualified historian of the variety being sought (perhaps a specialist in gender studies or the history of capitalism) who happens to work on topics related to computing.

This, more than anything else, explains the rise of the broad and non-technical approaches decried by Knuth. Work in the history of computing has been seen by most in the humanities as dull and provincial, excessively technical and devoid of big historical ideas. Whereas fields such as environmental history have produced widely recognized classics that convince non-specialists of the scholarly potential, historians of computing are still inching toward broad acceptance of their relevance. The roles Knuth outlined for them would not serve them well as they were essentially those of the research assistant: gather primary materials, translate them if necessary, and make them available to computer scientists who will do the analysis.

Current enthusiasm for the “digital humanities” and the inescapable importance of computing to the modern world could provide opportunities. One day humanities search committees might even seek out historians of computing, but only those whose work engages with and appeals to scholars who themselves know nothing of computer science. In the meantime many scholars with doctorates in the history of computing have found work in museums or in academic employment outside both history and computer science, for example, in business schools, information schools, or specialist programs such as engineering education. These positions pose their own disciplinary challenges, but for obvious reasons provide few incentives to study the history of computer science.


Prospects within Computer Science

Thus the kind of historical work Knuth would like to read would have to be written by computer scientists themselves. Some disciplines support careers spent teaching history to their students and writing history for their practitioners. Knuth himself holds up the history of mathematics as an example of what the history of computing should be. It is possible to earn a Ph.D. within some mathematics departments by writing a historical thesis (euphemistically referred to as an “expository” approach). Such departments have also been known to hire, tenure, and promote scholars whose research is primarily historical. Likewise medical schools, law schools, and a few business schools have hired and trained historians. A friend involved in a history of medicine program recently told me that its Ph.D. students are helped to shape their work and market themselves differently depending on whether they are seeking jobs in medical schools or in history programs. In other words, some medical schools and mathematics departments have created a demand for scholars working on the history of their disciplines and in response a supply of such scholars has arisen.

As Knuth himself noted toward the end of his talk, computer science does not offer such possibilities. As far as I am aware no computer science department in the U.S. has ever hired as a faculty member someone who wrote a Ph.D. on a historical topic within computer science, still less someone with a Ph.D. in history. I am also not aware of anyone in the U.S. having been tenured or promoted within a computer science department on the basis of work on the history of computer science. Campbell-Kelly, now retired, did both things (earning his Ph.D. in computer science under Randell’s direction) but he worked in England where reputable computer science departments have been more open to “fuzzy” topics than their American counterparts. Neither are the review processes and presentation formats at prestigious computer conferences well suited for the presentation of historical work. Nobody can reasonably expect to build a career within computer science by researching its history.

In its early days the history of computing was studied primarily by those who had already made their careers and could afford to indulge pursuing historical interests from tenured positions or to dabble after retirement. Despite some worthy initiatives, such as the efforts of the ACM History Committee to encourage historical projects, the impulse to write technical history has not spread widely among younger generations of distinguished and secure computer scientists.

To summarize, the upper-right quadrant in the accompanying table is essentially empty. It reflects historical work forming the backbone of a scholarly career and intended as a contribution to computer science. I share Knuth’s regret that the technical history of computer science is greatly understudied. The main cause is that computer scientists have lost interest in preserving the intellectual heritage of their own discipline. It is not, as Knuth implies, that Campbell-Kelly is representative of a broader trend of individual researchers deciding to stop writing one kind of history and to devote a fixed pool of talent to writing another kind instead. There is no zero sum game here. More work by professionally trained historians on social, institutional, and cultural aspects of computing does not have to mean less work by computer scientists themselves. They cannot count on history departments to do this for them, and I hope Knuth’s lament motivates a few to follow his lead in this area. Not simply because Knuth did it—few computer scientists have emulated him by procuring their own domestic pipe organs—but because his commitment to the intellectual history of computer science makes a powerful argument that historical knowledge of a particular kind is a prerequisite for deep technical understanding.


Reopening the Black Box

I will end on a positive note. In his paper, Campbell-Kelly offered a “biographical mea culpa” for his own early work that he now reads with a “mild flush of embarrassment.” He came to see his erstwhile enthusiasm for technical history as a youthful indiscretion and his conversion to business history as an act of redemption, paralleling his own development and that of the field in a way that relied implicitly on a rather unfashionable conceptualization of history as progress along a fixed trajectory.

Contrary both to Knuth’s despair and to Campbell-Kelly’s story of a march of progress away from technical history, some scholars with formal training in history and philosophy have been turning to topics with more direct connections to computer science over the past few years. Liesbeth De Mol and Maarten Bullynck have been working to engage the history and philosophy of mathematics with issues raised by early computing practice and to bring computer scientists into more contact with historical work.[3] Working with like-minded colleagues, they helped to establish a new Commission for the History and Philosophy of Computing within the International Union of the History and Philosophy of Science. Edgar Daylight has been interviewing famous computer scientists, Knuth included, and weaving their remarks into fragments of a broader history of computer science.[8] Matti Tedre has been working on the historical shaping of computer science and its development as a discipline.[22] The history of Algol was a major focus of the recent European Science Foundation project Software for Europe. Algol, as its developers themselves have observed, was important not only for pioneering new capabilities such as recursive functions and block structures, but as a project bringing together a number of brilliant research-minded systems programmers from different countries at a time when computer science had yet to coalesce as a discipline.[c] Pierre Mounier-Kuhn has looked deeply into the institutional history of computer science in France and its relationship to the development of the computer industry.[16]

Stephanie Dick, who recently earned her Ph.D. from Harvard, has been exploring the history of artificial intelligence with close attention to technical aspects such as the development and significance of the linked list data structure.[d] Rebecca Slayton, another Harvard Ph.D., has written about the engagement of prominent computer scientists with the debate on the feasibility of the “Star Wars” missile defense system; her thesis has been published as an MIT Press book.[20] At Princeton, Ksenia Tatarchenko recently completed a dissertation on the USSR’s flagship Akademgorodok Computer Center and its relationship to Western computer science.[21] British researcher Mark Priestley has written a deep and careful exploration of the history of computer architecture and its relationship to ideas about computation and logic.[18] I have worked with Priestley to explore the history of ENIAC, looking in great detail at the functioning and development of what we believe to be the first modern computer program ever executed.[9] Our research engaged with some of the earliest historical work on computing, including Knuth’s own examination of John von Neumann’s first sketch of a modern computer program[10] and Campbell-Kelly’s technical papers on early programming techniques.[5]


Most of this new work is aimed primarily at historians, philosophers, or science studies specialists rather than computer scientists. However, it does not shy away from engagement with the specifics of computer technology or the detailed workings of the computer science community, re-introducing technical analysis along with continued attention to social, cultural, and institutional factors. Some of it may confirm Campbell-Kelly’s prediction that the field will move toward “holistic” work integrating different approaches.

The history of computer science retains an important place within the diverse and growing field of the history of computing. Work of the particular kind preferred by Knuth will flourish only if his colleagues in computer science are willing to produce, reward, or commission it. I nevertheless hope he will continue to find much value in the work of historians and that we will rarely give him cause to reach for his handkerchief.


1. Aspray, W. John von Neumann and the Origins of Modern Computing. MIT Press, Cambridge, MA, 1990.

2. Aspray, W. and Williams, B.O. Arming American scientists: NSF and the provision of scientific computing facilities for universities, 1950–73. IEEE Annals of the History of Computing 16, 4 (Winter 1994), 60–74.

3. Bullynck, M. and De Mol, L. Setting-up early computer programs: D.H. Lehmer’s ENIAC computation. Archive of Mathematical Logic 49 (2010), 123–146.

4. Campbell-Kelly, M. From Airline Reservations to Sonic the Hedgehog: A History of the Software Industry. MIT Press, Cambridge, MA, 2003.

5. Campbell-Kelly, M. Programming the EDSAC: Early programming activity at the University of Cambridge. Annals of the History of Computing 2, 1 (Jan. 1980), 7–36.

6. Campbell-Kelly, M. The history of the history of software. IEEE Annals of the History of Computing 29, 4 (Oct.-Dec. 2007), 40–51.

7. Cerf, V. ACM and the professional programmer. Commun. ACM 57, 8 (Aug. 2014), 7.

8. Daylight, E.G. The Dawn of Software Engineering: From Turing to Dijkstra. Lonely Scholar, Heverlee, Belgium, 2012.

9. Haigh, T., Priestley, M., and Rope, C. Los Alamos bets on ENIAC: Nuclear Monte Carlo simulations, 1947–48. IEEE Annals of the History of Computing 36, 2 (Jan.-Mar. 2014), 42–63.

10. Knuth, D.E. Von Neumann’s first computer program. ACM Computing Surveys 2, 4 (Dec. 1970), 247–260.

11. MacKenzie, D. Knowing Machines. MIT Press, Cambridge, MA, 1998.

12. MacKenzie, D. Mechanizing Proof. MIT Press, Cambridge, MA, 2001.

13. Mahoney, M.S. and Haigh, T., Eds. Histories of Computing. Harvard University Press, Cambridge, MA, 2011.

14. Matti, T. The Science of Computing: Shaping a Discipline. CRC Press/Taylor & Francis, 2014.

15. Metropolis, N., Howlett, J. and Rota, G.-C. Eds. A History of Computing in the Twentieth Century: A Collection of Papers. Academic Press, New York, 1980.

16. Mounier-Kuhn, P. Logic and computing in France: A late convergence. In Proceedings of the Symposium on the History and Philosophy of Programming (Birmingham, July 2012).

17. Norberg, A.L. and O’Neill, J.E. Transforming Computer Technology: Information Processing for the Pentagon, 1962–1986. Johns Hopkins University Press, Baltimore, MD, 1996.

18. Priestley, M. A Science of Operations: Machines, Logic, and the Invention of Programming. Springer, New York, 2011.

19. Roland, A. and Shiman, P. Strategic Computing: DARPA and the Quest for Machine Intelligence. MIT Press, Cambridge, MA, 2002.

20. Slayton, R. Arguments that Count: Physics, Computing, and Missile Defense, 1949–2012. MIT Press, Cambridge, MA, 2013.

21. Tatarchenko, K. A House With the Window to the West: The Akademgorodok Computer Center (1958–1993). Princeton, 2013.

22. Tedre, M. The Science of Computing: Shaping a Discipline. CRC Press/Taylor & Francis, 2014.


Thomas Haigh is an associate professor of information studies at the University of Wisconsin, Milwaukee, and immediate past chair of the SIGCIS group for historians of computing.


a. See

b. The video is posted at

c. IEEE Annals of the History of Computing 36, 4 (Oct.–Dec. 2014) is a special issue based on this work.

d. Dick had earlier published “AfterMath: The Work of Proof in the Age of Human-Machine Collaboration,” Isis 102, 3 (Sept. 2011), 494–505.

Written by youryblog

December 31, 2014 at 2:52 PM

Yosemite, iOS 8, Spotlight, and Privacy: What you need to know By Rene Ritchie, Monday, Oct 20, 2014 at 8:31 pm EDT

leave a comment »

According to Landon Fuller, who collected the data in the first place,
this is not just about Spotlight, and the data will continue to be
sent to Apple even if Spotlight Suggestions -- or any of a number of
other seemingly relevant system configuration options -- are disabled.


for the raw data and analysis, without either the Apple apologism of
iMore or the journalistic spin of the Washington Post article they

Of course it is in Apple's interest to say that they care about
security and privacy, to emphasize how much effort they put into
minimizing data (we've heard this one from James Clapper before!), and
to claim that their snooping serves to benefit users by providing more
accurate answers.  None of this changes the surveillance they have
built into their system or how difficult it is to avoid!

Yosemite, iOS 8, Spotlight, and Privacy: What you need to know
By Rene Ritchie, Monday, Oct 20, 2014 at 8:31 pm EDT

A story made the rounds earlier today calling into question the new Spotlight Suggestions feature in OS X Yosemite and iOS 8. In an effort to garner attention, it reports the collection and usage of the information required to enable this feature in a needlessly scary way. As any long time reader knows, security and privacy are always at odds with convenience, yet features like Spotlight Suggestions — and Siri before it — do an excellent job balancing as much convenience as possible with maintaining as much privacy and security as possible. Here’s Apple’s statement on the matter:

“We are absolutely committed to protecting our users’ privacy and have built privacy right into our products,” Apple told iMore. “For Spotlight Suggestions we minimize the amount of information sent to Apple. Apple doesn’t retain IP addresses from users’ devices. Spotlight blurs the location on the device so it never sends an exact location to Apple. Spotlight doesn’t use a persistent identifier, so a user’s search history can’t be created by Apple or anyone else. Apple devices only use a temporary anonymous session ID for a 15-minute period before the ID is discarded.

“We also worked closely with Microsoft to protect our users’ privacy. Apple forwards only commonly searched terms and only city-level location information to Bing. Microsoft does not store search queries or receive users’ IP addresses.

“You can also easily opt out of Spotlight Suggestions, Bing or Location Services for Spotlight.”

Here’s the original charge:

Apple has begun automatically collecting the locations of users and the queries they type when searching for files with the newest Mac operating system, a function that has provoked backlash for a company that portrays itself as a leader on privacy.

The “backlash” cited by the sensationalistic story is not the result of the story but the result of sensationalism, and that’s disappointing. We depend on major publications to provide us with accurate information for our benefit, not for their own benefit. Where they could have taken the time to look into it, assess the facts, and help people understand, they chose to double down on FUD, and that’s not only disappointing, it’s distressing.

So what are the facts? Apple discloses how Spotlight Suggestions work in both the Spotlight section of System Preferences on the Mac, and in the Spotlight section of Settings > General on iPhones and iPads.

There’s also a Spotlight Suggestion check box on both so that you, the person using the device, can easily turn it off if you value privacy and security over convenience. (And if you are such a person, and have already disabled location services, Spotlight honors that setting and doesn’t send the information.)

Apple links to the following text right from the prefs/settings pane on both OS X and iOS. Not only is it simple to find, it’s plainly written and understandable:

When you use Spotlight, your search queries, the Spotlight Suggestions you select, and related usage data will be sent to Apple. Search results found on your Mac will not be sent. If you have Location Services on your Mac turned on, when you make a search query to Spotlight the location of your Mac at that time will be sent to Apple. Searches for common words and phrases will be forwarded from Apple to Microsoft’s Bing search engine. These searches are not stored by Microsoft. Location, search queries, and usage information sent to Apple will be used by Apple only to make Spotlight Suggestions more relevant and to improve other Apple products and services.

If you do not want your Spotlight search queries and Spotlight Suggestions usage data sent to Apple, you can turn off Spotlight Suggestions. Simply deselect the checkboxes for both Spotlight Suggestions and Bing Web Searches in the Search Results tab in the Spotlight preference pane found within System Preferences on your Mac. If you turn off Spotlight Suggestions and Bing Web Searches, Spotlight will search the contents of only your Mac.

You can turn off Location Services for Spotlight Suggestions in the Privacy pane of System Preferences on your Mac by clicking on “Details” next to System Services and then deselecting “Spotlight Suggestions”. If you turn off Location Services on your Mac, your precise location will not be sent to Apple. To deliver relevant search suggestions, Apple may use the IP address of your Internet connection to approximate your location by matching it to a geographic region.

Apple has also posted a privacy section on their website, and an updated version of their iOS 8 security document that reiterate what they’re doing and their long-standing position on privacy. Here are the relevant parts:

To make suggestions more relevant to users, Spotlight Suggestions includes user context and search feedback with search query requests sent to Apple.

Context sent with search requests provides Apple with: i) the device’s approximate location; ii) the device type (e.g., Mac, iPhone, iPad, or iPod); iii) the client app, which is either Spotlight or Safari; iv) the device’s default language and region settings; v) the three most recently used apps on the device; and vi) an anonymous session ID. All communication with the server is encrypted via HTTPS.

The white paper goes on to explain how locations are blurred, anonymous IDs are only kept for 15 minutes, recent apps are only included if they’re on a white list of popular apps, etc. (It starts on page 40 of the above-linked PDF if you’re curious about the specifics.)
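To make the list of context fields above a little more concrete, here is a minimal Python sketch of what such a request context could look like. It is only an illustration of the ideas described in the quoted text (a blurred location, a session ID discarded after 15 minutes, recent apps filtered through a white list); it is not Apple's implementation. Every name in it (`SessionId`, `blur_coordinate`, `build_context`, the dictionary keys) is hypothetical, and rounding coordinates to one decimal place is an assumed stand-in for whatever blurring Apple actually uses.

```python
import secrets
import time
from dataclasses import dataclass, field
from typing import Optional

# The 15-minute lifetime comes from the quoted statement; everything else is assumed.
SESSION_LIFETIME_SECONDS = 15 * 60


def blur_coordinate(value: float, places: int = 1) -> float:
    """Round a coordinate so only an approximate location is kept (assumed scheme)."""
    return round(value, places)


@dataclass
class SessionId:
    """A random, non-persistent ID that is replaced once it is 15 minutes old."""
    value: str = field(default_factory=lambda: secrets.token_hex(16))
    created_at: float = field(default_factory=time.monotonic)

    def current(self, now: Optional[float] = None) -> str:
        now = time.monotonic() if now is None else now
        if now - self.created_at >= SESSION_LIFETIME_SECONDS:
            # Discard the old ID and start a fresh, unrelated one.
            self.value = secrets.token_hex(16)
            self.created_at = now
        return self.value


def build_context(session, latitude, longitude, device_type, client_app,
                  locale, recent_apps, allowed_apps, now=None):
    """Assemble the six context fields listed in the quoted white paper text."""
    return {
        "approximate_location": (blur_coordinate(latitude), blur_coordinate(longitude)),
        "device_type": device_type,
        "client_app": client_app,
        "locale": locale,
        # Take the three most recent apps, then keep only those on the white list.
        "recent_apps": [app for app in recent_apps[:3] if app in allowed_apps],
        "session_id": session.current(now),
    }
```

The point of the sketch is that nothing in the dictionary identifies the device across sessions: once the session ID rotates, a new request cannot be linked to an earlier one by any field shown here.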

So, again, Apple is only doing what they need to do to provide the conveniences of the feature they announced — the same way they’ve needed to collect enough data to answer questions with Siri in the past, or show you locations on Maps, or find your iPhone, iPad or Mac, and the list goes on.

If you don’t like or want it, you can turn it off. That’s the real story here — education. How it works, and what you can do with it and about it.

If you have any concerns or questions about Spotlight Suggestions, let me know in the comments!

Written by youryblog

October 24, 2014 at 2:17 PM