This Is Not Simone de Beauvoir

an (non) Actor’s Lament

“He stole my scarab beetle!” I mumbled.

“No, no, no. Again, stronger please.” Each shake of Professor C’s shaggy head was rife with that combination of exasperation and despair so particular to the French.

“He STOLE my scarab beetle!”

Professor C sighed. I was supposed to be furious, yes? So, where was my fury?

Oh, my fury was there, all right, and, like usual, it was tied up in constricting knots.

I had arrived in Paris a month earlier, when the city was a green-gold oz of languid picnics and easy laughter and insouciant summer scarves. Now the ginko leaves were falling, and I was falling too, deeply disquieted by my inability to summon a six year-old’s roiling, aggrieved rage.

When I enrolled in “Acting French,” it had been in the manner of an empty nester signing up for pottery classes: a “why not?” threaded with a thin suspicion that I might actually be good at this. Apart from a successful turn as an angry Loyalist in our fifth grade class play, there was no historical basis for that suspicion — but I’m a card-carrying millennial; at 19, I believed I could be good at (almost) anything.

Right away, there were warning signs. In my naiveté, I had imagined that we would spend most of class reading aloud from classic French plays, like a sort of extended adult story hour. Maybe I would finally understand why people liked Les Miserables so much, I figured. But Professor C, sprightly in pirate boots and a military jacket, informed us that the play we would be workshopping all fall was one of her own. At the end of the semester, we would get to perform it at the American Embassy. What a treat for us all!

The play was based on the letters of Simone de Beauvoir. Simone was a complicated woman; Professor C acknowledged this by chopping up her life into eight different roles. A sylvan-voiced, doe-eyed Missourian, the sole theatre student in the class, got the role of twenty-something Simone, teenage Simone went to a very stylish Teen Vogue intern, and child Simone went to me.

Ninety percent of my role was a monologue concerning the whereabouts of the aforementioned beetle. Like a good budding philosophe, I exhausted all possibilities; then, I hid in some bushes. I always rushed through my monologue, hoping to reach the bushes before Professor C interrupted me. Unfortunately, the bushes were invisible (the professor was a big fan of the Our Town aesthetic, or else there was no money for props), so even when I escaped sans remark, my flaming cheeks were on full view.   

It was, objectively, a terrible play, and I, just as objectively, was terrible in it. My terribleness compounded, which confounded me, and then it flatlined, which confounded me more. I was accustomed to linear progression, but as October slid into November, it became clear to me that my only hope was an exponential hail Mary.

In Mid-December, we bundled the bony stools that constituted the entirety of our set into a van and set off for the Embassy. “Don’t come,” I instructed my boyfriend, but when I scrambled, fizzy with nerves, atop my stool, he was there, partially blocking the view of an ambassador. In the dressing room right before we went on, I had swallowed a glug of tequila, and as I waited for Professor C to finish introducing the play, I realized that the nerves I was feeling were the good kind, the kind I used to get as a swimmer behind the starting blocks. “Oh hey, this is going to be fine,” I thought, and it was. I tore into my monologue with the requisite fervor; for the first time, my lines didn’t feel like lines anymore, but like skin, something I owned. Quite suddenly it was all over and I was blushing less fiercely than usual and I could hear the audience laughing, just has they had when I was a bonneted Loyalist.

At the party afterwards, Professor C congratulated me and insisted that I finish one of the nicer bottles of champagne that was circulating. “You’ve earned it,” she said. But exponential Hail Marys, however welcome they may be, do not really give one the sense of having “earned it.”

The following fall, I received an email. Someone had made a documentary about Simone and our Embassy performance was part of it. There would be a viewing party at NYU’s Maison Francaise, in the Washington Mews. I was free that evening, but I did not go. I have never tried to act in anything since.

 

This Is Not Simone de Beauvoir

The Thrashers (thoughts on Emily Dickinson and solitude in the age of Trump)

I went to see the new Emily Dickinson exhibit at the Morgan this morning. I’ve liked Emily Dickinson since I was a child, without knowing more than the most famous of her poems. I read and reread Michael Bedard and Barbara Cooney’s picture book about her, and the little girl who spotted the train of her black dress once, on the stairs. Emily wrote the girl a poem, I remember, and in return, the girl gave her paperwhites, bulbs that harbinger spring in winter, indoors, the perfect flower for a recluse.

growing-paperwhites

The book didn’t make clear why Emily felt the need to retreat from a world that was already, relatively speaking, a retreat, and the exhibit didn’t either. Emily, the daughter of one of the treasurers of Amherst College, received the sort of education typically afforded only to wealthy young men. The exhibit contained yearbooks, programs for plays and “Eclectic Societies,” daguerreotypes of the prominent scholars and publishers and ministers whose lives Emily’s intersected with. She grew up in a cultural hub, even if the culture heavily tinged with a religion she could never bring herself to subscribe to. She had large, dark eyes and auburn hair she once prized enough to send her best friend a lock of; she was, in many ways, a catch.

She was a catch but she got caught in herself; by her mid-twenties, she had begun to withdraw, to curl up like a snail in her room, going outside only to garden. And yet, measured in words, her life was full indeed: over its course, she wrote 1800 poems, half of them between 1858 and 1865. Only 10 were published in her lifetime; all anonymously, and most without her say-so. Most of them existed in only one or two places, jotted in pencil on sheets of paper she first sewed into fascicles and later folded together in simpler sets. Some she sent to friends, or admired acquaintances; the majority were known only to her until after her death.

emily_dickinson_fascile

Like e.e. cummings, Emily saw poetry in multiple dimensions, played with capitalization, with alignment, with the shape of the paper beneath the words. She loved em-dashes; some of her poems seems studded with sharp intakes of breath. Shad her own editing notation: “N” for nature, “D” for death, crosses for words with alternates. She almost never used titles. Because of this and because so many of her poems lived in letters, and because her literal voice was so swoonily akin to her poetic one, her posthumous editors sometimes had to guess where the letter ended and the poem began. They may have guessed wrong, but if the world is your shellacking oyster, does it make a difference?

screen-shot-2017-02-12-at-10-57-43-am

In recent months, I’ve been doing some shellacking of my own, walling myself off from flesh and bone connections. Conversely, I’ve been texting furiously. Like Emily and the legions of IRC-ers and Redditors and 400 pound men who live in their parental basements, I’ve (re)discovered the comfort of a world of my own making, bolstered by communication whose direction I control. I drink and my texts grow expansive; I make plans I shy away from come shriveled morning.

It’s easy to blame the election for this, and much of it is reactionary: with so much to do, and so much to be angry about, I do nothing and seethe, or the easiest things, while dreaming of doing everything, and loudly. But Emily was at her most productive at the height of the Civil War; in solitude, she produced; I tend to reduce, or flail.

hopper-automat.jpg

This post doesn’t have an end; I’m not even sure what I’m trying to say, exactly. Is solitude justified if it’s productive? Can I spend my days consorting with flowers so long as I write about them in ways that contain at least a possibility of politically aware metaphor? Unless you’re a genius, that smacks of privilege. The day I felt least shellacked was the day of the Women’s March, in Washington. There was joy that day, in walking among so many people who didn’t just feel or dream but did. It helped that there was no cell service; the only place I could be was exactly where I was.

 

The Thrashers (thoughts on Emily Dickinson and solitude in the age of Trump)

The Gray Mare Is the Better Horse

nyc_blackout_77

Yesterday, it was beautiful in New York, unseasonably warm, but with November’s brand of curling sun. Perfect weather for waiting in a long line to cast a vote and then walking around with a multi-pronged, fuck-yeah ebullience. In email chains and Facebook posts and Instagram comments, there were giddy, reclamatory uses of  “nasty” and “badass,” photos of suffragettes’ sticker-coated headstones, and so many variations of #withher. We were poised on the precipice of righteousness; we were buoyant and thirsty; we were ready to revel.

garbage-burning-in-nyc-streets-1977

At the DNC this summer (how long ago that feels, now), there was a sense that patriotism, long the far right’s excuse for acts of prejudice and vitriol, might become a rallying cry for unity on the left. Both of the Obamas’ speeches made me proud to be an American, a pride that was kicked along by the Olympics and reached its apex about 30 minutes in to the first presidential debate.  Whatever pluckings of disbelief I felt at the polls being as close as they were (for even a margin of 20 percentage points felt close, given what was on the fuzzy end of that margin) were stayed by a confidence that Hillary would not, could not lose.

I live in a bubble. I’ve known that for years, but I didn’t really understand it until last night, when the New York Times forecast gauge slipped from blue to pink to red. Somewhere around the time Florida was called for Trump, that bubble burst.

ap7707140492

Today is not blue or pink or red. It is still warm, but not sunny and there are no frissons of ebullience to be had. Today is grey and the people I pass on the streets are grey and the man on the subway with his head in his hands is grey and I myself feel grey in a way I have never felt before. Instinct tells me to flee – not out of fear but out of anger and spite and a refusal to own our new reality. Braver to stay and fight – but how, with no house and no senate and the certainty of a conservative court and the spectre of alt-right death-eaters making preparations for their February feast?

One way, the way that seems most obvious, is to use the free tools that Trump employed so effectively against him, to combat his lies with truths, to share the fears of the frightened and disenfranchised so that they might be allayed, to keep insisting that patriotism is love and not hate. If we rage enough against the dying of the light, we just might revive it.

09-nyc-blackout-w1373-h913

The Gray Mare Is the Better Horse

Liminal States

Recently, I came across a blog my college roomates and I had kept throughout the last semester of our senior year. It was, largely, a cooking blog, and small-ly, a walking blog, because I was not at a stage where I could give anybody advice about cooking that didn’t start and end with a microwave. I read through the archives, which started with my and my roommate Emily’s joint birthday party and ended with a walk I’d done from my then-office in Times Square home to our apartment in Cobble Hill. The penultimate post, also by Emily, struck a similar chord to me as it had when I’d first read it, at 22. It was written shortly before our graduation, and, in addition to oatmeal cookies, it offers up some really f**king poignant end-of-an-era-now-what??? feelings.

The other night the kitchen was cluttered, it was filled with canned goods and rice and dirty dishes and my computer and shredded coconut and the millions of water glasses I am currently going through.  The next morning it was a little better, as all cluttered nights feel better after you sleep on it. And I think as the week goes on, it will most likely stay on the cluttered side, but I kind of like that. Life is never a shiny, clean kitchen. Especially when you’re graduating.

Emily titled the post “The State that I am in,” and the short answer was a liminal one.

The phrase “liminal state” has been clanging around my head all summer. The word “liminal” comes from the Latin limen, “threshold.” It’s a very pretty word, I think — it sounds like a skip. Nowadays, I don’t know if there is any word with as high a ratio of metaphoric:literal as threshold, though back in Roman times, maybe that wasn’t the case.

the-upside-down

I have my own liminal state, but I’m seeing them everywhere. You could call the Upsidedown in the Netflix series Stranger Things an alternate world but I’d call it a liminal one, attached to the rightsideup like Peter Pan’s stockings. I fell asleep and dreamed our bedroom had grown mossy cumuli and a suspended haze.

Last night, I rewatched Joe Swanberg’s Drinking Buddies, which is such a lovely, unshowy little movie, in the vein of Before Sunrise, about the sort of friendship that can become something else. There are so many scenes in Drinking Buddies where the central teeter, in words but often in gestures, on the brink of that something else; they burst with liminality.

drinking-buddies

A liminal state is a bit of a Shrodinger’s cat: neither one thing nor the other, or both, or their bleeding, shrouded edge. It is not necessarily a bad state, though it can feel so — but then again it can also feel deliciously close to a desired outcome, without all of the responsibility of owning its results.

In a liminal state, we are none of us terminal cases.

 

Liminal States

Cruise Control, or Why Katie Ledecky Dominates Distance Freestyle

I began paying attention to Katie Ledecky, the American distance swimmer, after Brian Phillips wrote a fantastic profile on her for Grantland, in the fall of 2014. Ledecky was seventeen then; she had been the reigning world record-holder in the women’s 800 meter freestyle for over 2 years. Actually, reigning, which suggests stasis, isn’t the best gerund for Ledecky, who, after whittling nine seconds from her prelims time to seize the gold at the 2012 Olympics*, kept whittling. 8 minutes, 14 seconds. 8 minutes, 13 seconds. 8 minutes, 11 seconds. Since Phillips wrote his piece, Ledecky has continued to get faster and faster; in addition to the 800, she now holds world records in the 400 and the mile — the latter by 17(!!!) seconds.It’s in the mile, as Phillips noted, that Ledecky laps her opponents**. At the elite level, this is, let’s just say, unusual. More unusual: holding world records in both mid-distance and distance events (and in Rio, she’ll be racing the 200 — nearly a sprint!– as well).

ledecky_world_record
Via the New York Times

In 2014, Phillips was one of the few non-niche sportswriters to cover Ledecky. These days, with Rio upon us and Ledecky our best, surest chance at Olympic aqua-Gold, features abound. Most of them hammer home the following two points: 1) Katie Ledecky is “the greatest athlete in America”/”peerless”,”better at swimming than anyone is at anything”, and 2) Katie Ledecky is really nice. While I LOVE seeing swimming get some mainstream lip service, calling someone unbeatable without offering more than naive straws as to why they are unbeatable is boring, and calling an athlete “nice” is just lazy. Basically, some of these pieces come off like pre-Kimye T-Swift features, is what I’m saying.

Nonetheless, Katie Ledecky, while perhaps not “better at swimming than anyone is at anything,” is going into Rio with a wider berth between her and her competitors than maybe any other athlete. There are a few explanations for this, some to do entirely with Ledecky herself, and some to do with the mechanics of distance swimming.

First, timing. At 11, Ledecky was still a multisport athlete; two years later, she and her then coach, Yuri Suguiyama, were mapping out plans for London. This might seem like a crazy leap, but it is fairly common for previously casual swimmers go through a period of staggering improvement when they hit adolescence.***  For some, this period may be short and followed by a plateau before improvement continues, but for others, like Ledecky, the period can last for years with little abatement.

Second, margins. The press, not particularly hip to the vagaries of distance swimming, are astounded by the number of seconds Ledecky can drop in an 800 or 1500 between meets. For someone on a spate of improvement, dropping two seconds in the 800 over a month is like dropping maybe a tenth of a second in the 100. Think of it as less of a sprint, more of, I dunno, a half-marathon. Even at the elite level, there’s some cushion, especially before you figure out how you want to swim it. (The length of the race is also why Ledecky is able to establish multi-second leads: being a little faster over the course of a 50 multiplies to being a lot faster over the course of a 800 or a 1500). And these margins are why I think distance swimming is so fun to watch: unlike the 50 or the 100, which are decided almost at the dive, or even the 200, the distance events offer a lot of uncertainty — not only in who will win, but how?

I was never a true distance swimmer, so I’m not the most suited to discuss distance race strategies. That being said, there are three common ones:

  • Go out fast and try to hold on. (When the place matters more than the time, this can be a decent strategy for messing with your competitors’ head games, at least.)
  • Go out easy and build. (If you’re not as susceptible to head games and you feed off pressure, this strategy can pay off. Bonus: it’s more exciting to watch.)
  • Stay a maintainable speed for pretty much the entire race, save for the first and last fifty.

The last method is the trickiest to manage, because the swimmer needs to know exactly what her maintainable speed is. At the elite level, this method also requires the most endurance and raw strength, because the swimmers are holding nearly race pace for over eight minutes. Ledecky follows this strategy, as do some of the elites who she’ll most likely be swimming against in the prelims and finals at Rio.

Below are the splits for each successive 50m in the 800 for the world’s fastest swimmers in this event, using their best times of the past two years.****

image (57)

Ledecky swims a nearly perfect steppe: fast out, steady on, fast close. For the middle 700, the standard deviation between her splits is 0.12 seconds. The average standard deviation for the other elites listed here is 0.27 seconds.

The New Zealand swimmer Lauren Boyle, the German Sarah Kohler, and Ledecky’s teammate Leah Smith swim their middle 700 somewhat evenly (stdevs of .18, .27, .27, respectively), but they don’t close like Ledecky, whose final 50 is within half a second of her first 50. Only Jessica Ashwood (AUS) has a final 50 that is closer to her starting 50 than Ledecky, but Ashwood’s strategy is more of a come-from-behind than any other in this chart.

What this all boils down to — and the reason Ledecky dominates this event — is that she is able to swim fifteen 50s at a pace that is, on average, a half second faster than what any other female swimmer can do.

Screen Shot 2016-08-01 at 10.18.16 PM

This isn’t a predictions post, as, barring deus ex machina, Ledecky, whose fastest time this year is 12 seconds above that of Jazz Carlin, the runner-up, will win this handily. That said, looking at the data, I’d say that Carlin and Jessica Ashwood might have the best chance at coming within ~7 seconds of Ledecky. Not because their swims are steppe, but because of the opposite: they have relatively weaker middles, and the middle is the most straightforward target for time whittling.

 

 

*Wresting crowns from the Brits is always a bit of fun, especially when it’s done under the gaze of the king and queen.

**Though it’s unlikely that you’ll get to see her lap anyone in Rio, as women, ridiculously, are forbidden from racing the mile.

***My own childhood coach attributed this phenomenon, in part, to not knowing enough to overthink, and I see echoes of this in many a profile on top American swimmers.

****Missing Zhang Yuhan, of China, as I was not able to find her splits.

Cruise Control, or Why Katie Ledecky Dominates Distance Freestyle

Do the Jitterbug (an analysis of cold brew efficacy using D3.js)

Screen Shot 2016-07-24 at 9.09.47 PM

I write this sitting in the kitchen sink… Just kidding! My kitchen sink barely holds one wriggling (fat) pug. In truth, I write this melting, Elmira-style, in the peak of #heatdome2016. Temperatures like these (mid 90’s, egads!), merit rewards merely for enduring them, and what is a five dollar vat of caffeine if not a reward?

There are three coffee shops within 1-2 blocks of my apartment, and another handful scattered round my office. I buy cold brew most mornings, but where I buy it depends on mood, lateness, grogginess, general aesthetic preference, etc. In my head, they all cost around $4, but I was vaguely curious to see how the price stacked up versus size, and, more importantly, jitter quotient. Call me a New Yorker (it’s been eight years — I’m close!), but I prefer to let the ice cubes do all the dilution.

I created the following visualization by modifying Michele Weigle’s scatterplot so that it rendered images instead of circles. Specifically, I replaced

 svg.selectAll(".dot")
      .data(data)
    .enter().append("circle")
      .attr("class", "dot")
      .attr("r", 3.5)
      .attr("cx", xMap)
      .attr("cy", yMap)
      .style("fill", function(d) { return color(cValue(d));}) 
      .on("mouseover", function(d) {
          tooltip.transition()
               .duration(200)
               .style("opacity", .9);
          tooltip.html(d["Cereal Name"] + "<br/> (" + xValue(d) 
	        + ", " + yValue(d) + ")")
               .style("left", (d3.event.pageX + 5) + "px")
               .style("top", (d3.event.pageY - 28) + "px");
      })
      .on("mouseout", function(d) {
          tooltip.transition()
               .duration(500)
               .style("opacity", 0);
      });

with:

 svg.selectAll(".dot")
 .data(data)
 .enter().append("image")
 .attr('xlink:href',function(d){
 if (d.neighborhood =='Fidi'){
 return 'icedcoffee_blue.png'
 }
 else { 
 return ('icedcoffee_purple.png')
 }
 })
 .attr('class', 'dot')
 .attr('x', xMap)
 .attr('y', yMap)
 .attr('height', '50')
 .attr('width', '50')
 .on("mouseover", function(d) {
 tooltip.transition()
 .duration(200)
 .style("opacity", .9);
 tooltip.html("jitter-factor: "+ xValue(d)
 + ", " + "price/oz: $"+ yValue(d))
 .style("left", (d3.event.pageX + 5) + "px")
 .style("top", (d3.event.pageY - 28) + "px");
 })
 .on("mouseout", function(d) {
 tooltip.transition()
 .duration(500)
 .style("opacity", 0);
 });

The resulting viz indicates that our own cafeteria gives by far the best jitter bang for buck, while The Wooly Daily ought to be avoided (though the vibes are decent). Also, Fort Greene caffeine tends to be both stronger and pricier than its FiDi counterpart.

The code for this viz is up on github. Feel free to plug your own coffee places into the jitterbug.csv file and see how they shake up!

Do the Jitterbug (an analysis of cold brew efficacy using D3.js)

So You Want to Be a Data Analyst, Part 4:Visualizing Your Data with D3js

A picture is worth 1,000 words, or so throw pillows tell me. A good data visualization can be worth millions of rows (or thousands, or billions, even, depending on the size of the data you’re illustrating). Conversely, a bad data visualization, put into the wrong hands, can be just as harmful as a photoshopped Sandy doomsday photo. The previous two posts in this series have addressed acquiring and making sense of your data; today, I’ll take you through the process of turning that data into a visual story using the D3js library. First, though, I want to cover the reasons why you’d want or need to visualize your data, and explain when it makes sense to create the visualization programmatically instead of using a ready-made chart builder.

Talk Graphics to Me

A graphic/viz/chart is to data what symbols are to words: a succinct means of conveying a large amount of information. Like symbols, visualizations tend to be more eye-catching than the data they encapsulate. Use them when you’re reporting on data that has more than one facet (eg quantities over time, or quantities by category, or quanities by category over time).

Likely, you’ve created data visualizations before, in our old pal MS Excel (or, if you’re like me and LOATHE Excel, Google Spreadsheets). Common viz include bar charts (horizontal and vertical — good for showing a breakdown of quantities by category), line charts (good for showing change in one or more categories over time), scatter plots (good for showing the relationship between two quantities), and pie charts (good for nothing. JK — they’re good for showing breakdowns of quantities by category, provided you have a small number of categories).

If your data is already in a spreadsheet, using a built-in chart to summarize it in one of the ways discussed above is totally fine. I use Google Spreadsheets’ built-in charts often — they have a decent amount of customization, and the added bonus of being embeddable. So when should you use something else?

Taking Measures into Your Own Hands

  1. The first reason you might want to create your own data visualization relates to what I said about Google Spreadsheets being embeddable: shareability. No one should ever have to open an Excel attachment (or any attachment) to view a chart. You can insert your charts in the body of your email, but it’s nice, I think, to have a backup, especially if that backup is interactive (so really, the inserted image is the sneak preview). Also, if you have a data blog or some sort of internal company site, you can slap your custom-made data visualization in there and–provided you’ve styled it right–not have to worry about it breaking on your CEO’s iPad.
  2. The second reason to DIY is customization. Your dataset might not fit in Excel’s default charts, or maybe it does, but you want to show it off in something nicer. A little couture never hurt anyone. D3 shines here — once you get good at it, you can visualize so many data structures (for some great examples, check out Kantar’s Information Is Beautiful 2015 finalists, or the New York TimesUpshot blog).
  3. The third reason is that your data isn’t in a spreadsheet, and you don’t want to export it into one. This is the dashboard scenario: hook your visualization straight into your database, schedule your query so that the data refreshes, and let ‘er rip.
  4. The fourth reason is replicability. Chances are, you’re going to be reporting on the same or similar data (or data with the same or similar structure) more than once. If you create your visualization layer programmatically, you can easily swap in new data for the old data, et voila. To create the D3 example below, I modified a script I’d already created for the Met Gala. The original took me well over an hour to create. This version took maybe ten minutes.

<Caveat Lector>

People not on the creation side of visualizations tend to trust the result implicitly, and this trust can be deliberately or mistakenly abused. The best way to prevent bad visualizations is to understand your data, and its conclusions, before you translate them. Where possible, you should provide your dataset along with your visualization, so that your audience can a) pursue their own deeper understanding of the data and b) validate your work.

</Caveat>

Time to Bind

D3 is certainly not the only javascript library designed to bind data to DOM, but it’s one of the best (IMHO), buzziest (indisputable), and most-used. It was created by Mike Bostock, back when he was the graphics editor at the New York Times. Remember when the NYT started gathering acclaim for its visual articles like “512 Paths to the White House?” That was Bostock’s doing (though the NYT as a whole deserves a lot of kudos for supporting a very different means of information display long before anyone else).

I’m not going to spend too much time explaining D3 upfront, because others, including Bostock himself, have done so extensively. Here’s what you need to know before we get graphing:

  • D3.js is a javascript library that allows you to attach data to the DOM of a web page.
  • Once attached, that data can be manipulated by manipulating the DOM elements its bound too.

Okay’s let’s get crackin. Today, you’re going to create a bubble graph that shows the wax and wane of music genres’ popularity over time. The dataset you’ll be using contains information about music genre popularity in the United States between 1960 and 2010. The popularity of a given genre is simply a sum of the times it appeared in the Billboard Hot 100 in a given year. This dataset was created by aggregating Matthias Mulch’s song-level data, which he used in “Evolution of Popular Music: USA 1960-2010.” You can download the data, as well as all of the visualization code, on Github.

Step 1: Get your data.

Honestly, this can be the hardest part, depending on how you approach D3, and also what kind of data you’re working with. In an ideal world, you already have access to the data you need, and you load it, untouched, into D3 and start manipulating. If replicability matters to you and/or you’re building a dashboard, you’re going to want to do as much of the manipulation directly in D3 as possible. That being said, when you’re starting out, you’re probably going to find yourself doing the following:

  1. Deciding, from the shape of your result set, which visualization(s) would best explain it.
  2. Finding an example D3 visualization online.
  3. Realizing that the data it relies upon is either hard-coded or shaped differently from yours.
  4. Retrofitting your data to resemble the example data.

This is okay, in the beginning. As you learn D3, you’ll become more comfortable with transforming your data into the shape you need in D3, using D3’s built-in nest and mapping functions. Today, however, you’re going to do a minimal amount of manipulation in D3 because the music data you’ll be using is already in the necessary format.

Let’s look at the head of the data:

“year”,”genre”,”score”
1960,”ALTERNATIVE”,1
1961,”ALTERNATIVE”,0
1962,”ALTERNATIVE”,0

Big surprise — there was not much alternative music in 1960! Shall we check out the tail?
2004,”SOUL”,23
2005,”SOUL”,21
2006,”SOUL”,25

See, Dad — soul didn’t die in the seventies! (That would be Motown.)

The point of all this inspecting is to know the shape of your data: three columns, one a date, one a category, and one an integer. As you start doing more D3, you’ll want to compare your data to the data in the examples. One way or another, yours will need to match the example shape before you start drawing your visualization.

Step 2: Set up your html, css, and D3 scaffold.

By which I mean, a basic html web page with a <body> tag to bind to, css to dictate any over-arching styles, and the start of your javascript, which will contain your visualization’s width, height, and margins. Typically, you’ll break these out into their own files, but for brevity’s sake here, I’m going to put everything into the HTML.

<!DOCTYPE html>
 <meta charset="utf-8">
 <style>

body {
 font: 12px sans-serif;
 }

.axis path,
 .axis line {
 fill: none;
 stroke: grey;
 shape-rendering: crispEdges;
 }

.dot {
 stroke: none;
 fill: steelblue;
 }

.grid .tick {display: none;}

.grid path {
 stroke-width: 0;
 }

div.tooltip {
 position: absolute;
 text-align: center;
 width: 80px;
 height: 42px;
 padding: 2px;
 font: 12px sans-serif;
 background: #ddd;
 border: solid 0px #aaa;
 border-radius: 8px;
 pointer-events: none;
 }

</style>
 <body>
 http://d3js.org/d3.v3.min.js


 //We kick off any d3 javscript with basic sizing
 var margin = {top: 100, right: 20, bottom: 30, left: 100},
 width = window.innerWidth - margin.left - margin.right,
 height = 500- margin.top - margin.bottom;

Step 3: Add your data stipulations.

By stipulations, I mean the specific properties of your dataset. Since this dataset contains time data, you’ll want a parse date function that tells D3 “hey, this column is a date and it’s in this format.” You might want a format date function so that you can control what the date displayed on your axis looks like. We’re going to be drawing a categorical bubble chart on a time-series, so you’ll need a time-scaled X axis and an ordinal (non-linear/aka non-quantitative) Y axis. You’ll also want a color scale to differentiate between the genres. All these javascript stipulations will look like this:

//if we have date data, we need to tell d3 how to read it.
 //in the music data case, all we have is the year
 var parsedate = d3.time.format("%Y").parse;
 var formatDay_Time = d3.time.format("%Y");

//we're going to want to color our scores based on the genre. Here, we're just setting a color variable.
 var color = d3.scale.category10();
 //We need to tell d3 what kind of axes we want.
 //In the music data case, we want our dates on the x axis and our genres on the y.
 //Because the genres are categorical, we will explicitly tell d3 how far apart to space them, using rangeRoundBands
 var x = d3.time.scale().range([0, width]);
 var y = d3.scale.ordinal()
 .rangeRoundBands([0, height], .1);

//tell d3 to orient the y axis on the lefthand side of the graph
 var yAxis = d3.svg.axis()
 .scale(y)
 .outerTickSize(0)
 .innerTickSize(0)
 .orient("left");

//put the x axis on the bottom
 var xAxis = d3.svg.axis()
 .scale(x)
 .orient("bottom");

Step 4: Create a visualization object and attach it to the DOM.

The visualization object is called an SVG, aka “scaleable vector graphic.” This is the thing onto which your data will attach and come to life. Declare and attach it to your <body> tag like this:

var svg = d3.select("body")
 .append("svg")
 .attr("width", width + margin.left + margin.right)
 .attr("height", height + margin.top + margin.bottom)
 .append("g")
 .attr("transform", "translate(" + margin.left + "," + margin.top + ")");

Step 5: Load your data and attach it to your SVG.

You can hard code your data, but that’s super onerous and also not replicable, so I’d advise against it. Instead, use one of D3’s loading methods. In our case, we’ll use d3.csv, because our data is in a csv format. Once we load it, we loop through it and format it correctly.

d3.csv("data/history_of_music.csv", function(error, data) {
 data.forEach(function(d) {
 d.year = parsedate(d.year);
 d.score = +d.score;
 });

 

Step 6: Define/create your axes

 

The domain methods tell D3 which data should be attached to your axes. In our case, the d.year data goes to the x axis,the d.genre goes to the y axis, and the d.score dictates the radii of the bubbles. Defining the x axis is easy:

// Set our x axis domain with the earliest and latest years
 x.domain([new Date(1959, 01, 01), new Date(2009, 12, 01)]);

To define the y axis, we want the unique genre names. We can get them by looping through all of the genres and appending each new one to an array.

//Now we'll create an array of unique genres, which we'll use to create our y axis
 //to do this, we just loop through the genres data and append each name
 //that doesn't already exist in the array
 genres_filtered = []

data.forEach(function (d) {
 if (genres_filtered.indexOf(d.genre) === -1) {
 genres_filtered.push(d.genre);
 }
 });

To define the r radius, we use the maximum score divided by the number of genres, so that no bubble will be too big.

//Now we set the radius to be between zero and the height divided by the count of unique genres
 var r = d3.scale.linear().range([0, height / genres_filtered.length]);
 r.domain([0, d3.max(data, function(d) { return d.score})]);

Last, we'll append our x and y axis to the SVG using the call function. 

//actually append our y axis to the svg 
 svg.append("g")
 .attr("class", "y axis")
 .call(yAxis)
 .selectAll("text")
 .attr("dx", 0)
 .attr("dy",-5)
 .style("text-anchor", "end");

//append our xaxis to the svg
 svg.append("g")
 .attr("class", "x axis")
 .attr("transform", "translate(0," + height + ")")
 .call(xAxis);

Step 7: Draw the bubbles.

Here’s the fun part! To draw the bubbles, we’ll use the year and genre to place them on the axes, the score to size them, and genre again to color them. We’ll also add a little mouseout action to display the genre and sum of songs for that year.

// attach our tooltip
var div = d3.select("body").append("div")
 .attr("class", "tooltip")
 .style("opacity", 1e-6);
//draw our circles based on the scores, and attach them to the svg
svg.selectAll(".dot")
 .data(data)
.enter().append("circle")
 .attr("class", "dot")
 .attr("r", function(d) { 
 return r(d.score); 
 })
 .style("opacity", 0.25)
 .attr("cx", function(d) { return x(d.year); })
 .attr("cy", function(d) { return y(d.genre); })
 .style('fill', function (d) { return color(d.genre); })
 .on("mouseover", function(d) {
 div.transition()
 .duration(200)
 .style("opacity", .7);
 div .html(
 formatDay_Time(d.year) + "<br/>" +
 d.genre + "<br/>" + 
 d.score) 
 .style("left", (d3.event.pageX) + "px")
 .style("top", (d3.event.pageY - 42) + "px");
 }) 
 .on("mouseout", function(d) {
 div.transition()
 .duration(500)
 .style("opacity", 1e-6);
 });

Lastly, we’ll append a title.

svg.append("text")
 .attr("x", (width / 2)) 
 .attr("y", 0 - (margin.top / 2)) 
 .attr("text-anchor", "middle") 
 .style("font-size", "16px") 
 .text("Are We Human or Are We Dancer: the Rise and Fall of Musical Genres, 1960 - 2009, per Volume of Billboard Hot 100 Hits");

});

The full page with the script will look like this:

<!DOCTYPE html>
<meta charset="utf-8">
<style>

body {
 font: 12px sans-serif;
}

.axis path,
.axis line {
 fill: none;
 stroke: grey;
 shape-rendering: crispEdges;
}

.dot {
 stroke: none;
 fill: steelblue;
}

.grid .tick {display: none;}

.grid path {
 stroke-width: 0;
}

div.tooltip {
 position: absolute;
 text-align: center;
 width: 80px;
 height: 42px;
 padding: 2px;
 font: 12px sans-serif;
 background: #ddd;
 border: solid 0px #aaa;
 border-radius: 8px;
 pointer-events: none;
}

</style>
<body>
http://d3js.org/d3.v3.min.js


//We kick off any d3 javscript with basic sizing
var margin = {top: 100, right: 20, bottom: 30, left: 100},
 width = window.innerWidth - margin.left - margin.right,
 height = 500- margin.top - margin.bottom;

//if we have date data, we need to tell d3 how to read it. 
//in the music data case, all we have is the year 
var parsedate = d3.time.format("%Y").parse;
var formatDay_Time = d3.time.format("%Y");

//we're going to want to color our scores based on the genre. Here, we're just setting a color variable. 
var color = d3.scale.category10();


//We need to tell d3 what kind of axes we want.
//In the music data case, we want our dates on the x axis and our genres on the y.
//Because the genres are categorical, we will explicitly tell d3 how far apart to space them, using rangeRoundBands
var x = d3.time.scale().range([0, width]);
var y = d3.scale.ordinal()
 .rangeRoundBands([0, height], .1);

//tell d3 to orient the y axis on the lefthand side of the graph
var yAxis = d3.svg.axis()
 .scale(y)
 .outerTickSize(0)
 .innerTickSize(0)
 .orient("left");

//put the x axis on the bottom
var xAxis = d3.svg.axis()
 .scale(x)
 .orient("bottom");

//here we create our graph object, attach it to the body element of our html
//and append the sizing attributes we specified earlier
var svg = d3.select("body")
 .append("svg")
 .attr("width", width + margin.left + margin.right)
 .attr("height", height + margin.top + margin.bottom)
 .append("g")
 .attr("transform", "translate(" + margin.left + "," + margin.top + ")");


//Now, we read in our music data.
//We'll parse our dates and tell d3 that our scores are numeric 
d3.csv("data/history_of_music.csv", function(error, data) {
 data.forEach(function(d) {
 d.year = parsedate(d.year);
 d.score = +d.score;
 });

//check to make sure the data was read in correctly
 console.log(data);

// Set our x axis domain with the earliest and latest years
 x.domain([new Date(1959, 01, 01), new Date(2009, 12, 01)]);

//Now we'll create an array of unique genres, which we'll use to create our y axis
 //to do this, we just loop through the genres data and append each name
 //that doesn't already exist in the array
 genres_filtered = []

data.forEach(function (d) {
 if (genres_filtered.indexOf(d.genre) === -1) {
 genres_filtered.push(d.genre);
 }
 });


 //Now we set the radius to be between zero and the height divided by the count of unique genres
 var r = d3.scale.linear().range([0, height / genres_filtered.length]);
 r.domain([0, d3.max(data, function(d) { return d.score})]);


 //add the genre names to the y axis
 y.domain(genres_filtered);

//color our bubbles based on the genre names
 color.domain(genres_filtered);

//actually append our y axis to the svg 
 svg.append("g")
 .attr("class", "y axis")
 .call(yAxis)
 .selectAll("text")
 .attr("dx", 0)
 .attr("dy",-5)
 .style("text-anchor", "end");

//append our xaxis to the svg
 svg.append("g")
 .attr("class", "x axis")
 .attr("transform", "translate(0," + height + ")")
 .call(xAxis);

// attach our tooltip
var div = d3.select("body").append("div")
 .attr("class", "tooltip")
 .style("opacity", 1e-6);

//draw our circles based on the scores, and attach them to the svg
svg.selectAll(".dot")
 .data(data)
.enter().append("circle")
 .attr("class", "dot")
 .attr("r", function(d) { 
 return r(d.score); 
 })
 .style("opacity", 0.25)
 .attr("cx", function(d) { return x(d.year); })
 .attr("cy", function(d) { return y(d.genre); })
 .style('fill', function (d) { return color(d.genre); })
 .on("mouseover", function(d) {
 div.transition()
 .duration(200)
 .style("opacity", .7);
 div .html(
 formatDay_Time(d.year) + "
" +
 d.genre + "
" + 
 d.score) 
 .style("left", (d3.event.pageX) + "px")
 .style("top", (d3.event.pageY - 42) + "px");
 }) 
 .on("mouseout", function(d) {
 div.transition()
 .duration(500)
 .style("opacity", 1e-6);
 });

// Add the title
 svg.append("text")
 .attr("x", (width / 2)) 
 .attr("y", 0 - (margin.top / 2)) 
 .attr("text-anchor", "middle") 
 .style("font-size", "16px") 
 .text("Are We Human or Are We Dancer: the Rise and Fall of Musical Genres, 1960 - 2009, per Volume of Billboard Hot 100 Hits");

});

 

CONGRATULATIONS! A dataviz iz here! Though, if you’ve never built a web page with javascript before, you might be like…where? To view this guy, I recommend using the very lightweight SimpleHTTPServer. Just install it using your package manager of choice (I use pip), and make sure you’re in the same folder as your viz file when you launch the server. Once you’ve done that, you should see:

Screen Shot 2016-05-22 at 9.57.51 PM This post isn’t really about analyzing this particular dataset, but there are some trends that stand out to me. Namely, that pop peaked in the mid-80s, country didn’t pick up much mainstream steam until the late nineties, and rap has been waning as hiphop has risen. Also, I should say that these ten genres were the ten most popular genres out of a much larger genre set, so lots of the more interesting ones, like New Wave and Postpunk, are not represented here.

Anyways, I’m going to rap this up because it’s getting kinda Joycean in its length (though hopefully not in its parseability!). The code, including that for transforming Mauch’s data into the needed shape, is on Github. Have at it!

So You Want to Be a Data Analyst, Part 4:Visualizing Your Data with D3js