Category Archives: mathematica

Jewish New Year Distribution

I was talking to somebody the other day who was pleased because his birthday (September 9) landed on the Jewish New Year. He made the dubious claim that this happened something like 17% of the time.

For a quick check, I used the JewishNewYear[] function built in to Mathematica’s Calendar package. It is good from 1900 through 2100.


Needs["Calendar`"]

In[]:= Length[Select[
Map[JewishNewYear[#] &,
Range[1900, 2100]], #[[2]] == 9 && #[[3]] == 9 &]]

Out[]= 7

7/Length[Range[1900, 2100]] == .0348, so over this 201 year time span, his birthday would have landed on Rosh Hashana a bit under 3.5% of the time.

Curious about how the holiday distributed, I generated a table of frequencies (all displayed as though they were 2010 dates so we can use the built-in DateListPlot[] function).


In[]:= dateFreq =
Map[{#[[1]], Length[#]} &,
Gather[Sort[
Map[{2010, JewishNewYear[#][[2]], JewishNewYear[#][[3]]} &,
Range[1900, 2100]], AbsoluteTime[#1] <= AbsoluteTime[#2] &]]]; In[]:= DateListPlot[dateFreq, Filling -> Axis,
FillingStyle -> {Automatic, {Black, AbsoluteThickness[10]}},
GridLines -> None, Frame -> False, Axes -> True, AxesStyle -> Black,
AxesOrigin -> {Automatic, 0}, PlotRange -> All,
PlotLabel -> "Rosh Hashana Distribution"]

Note the nested use of Length[], Gather[], Sort[]. That is often a powerful combination when using Mathematica to analyze the frequencies of events or patterns in data.

In this case we do not have a uniform distribution — it clearly fades at the edges, but it’s not normally distributed either. Figuring out whether or not there is a meaningful pattern here is beyond my statistical skills.

reversing the x-axis in a Mathematica graph

Sometimes you have data in which the dependent variable decreases as the independent variable increases, and for reasons of clarity in illustrating it, you want to flip the x-axis and show an increasing line (or vice versa). For example, sometimes you have

and you want to display it as

Unfortunately, Mathematica’s default behavior is to show both axes growing as you go up and to the right, and there is no simple option to change that. You can achieve this however by transforming your data by hand so that it is displaying the desired way, and then setting the “Ticks” variable of the plotting function to display the Ticks in reverse order. You can’t just say “reverse,” you have to create a list of the new tick labels and where you want them to go. This is generally done with a call to the Table[] command.

I more commonly flip the X axis than the Y, and given data in {{x1,y1},{x2,y2} . . . {xn,yn}} format, the following code makes it very easy to do.

xFlipper[listofXYpairs_] :=
Map[{Last[listofXYpairs][[1]] + First[listofXYpairs][[1]],
0} + {-1, 1}*# &, listofXYpairs]

Options[xFlippedTicks] = {numPoints -> 10, digits -> "All"};

xFlippedTicks[listofXYpairs_, OptionsPattern[]] :=
Table[{Last[listofXYpairs][[1]] + First[listofXYpairs][[1]] - i,
If[OptionValue[digits] == "All", i,
Round[i, 10^-OptionValue[digits] // N]]}, {i,
First[listofXYpairs][[1]], Last[listofXYpairs][[1]], (
Last[listofXYpairs][[1]] - First[listofXYpairs][[1]])/
OptionValue[numPoints]-1}]

The numpoints variable sets how many points to show on the X axis. The digits variable can reduce the number of the digits shown in the tickmarks if needed to make the graph more useful or attractive.

The first graph above was produced with

ListPlot[returnPairs, PlotLabel -> "Expected IRR vs. Upfront cost",
AxesLabel -> {"Amt Paid", "IRR"}, Joined -> True, Mesh -> True]

Given these two functions, we can transform it to the second graph above with

ListPlot[xFlipper[returnPairs], PlotLabel -> "Expected IRR vs. Upfront cost",
Ticks -> {xFlippedTicks[returnPairs, numPoints -> 5], Automatic},
AxesLabel -> {"Amt Paid", "IRR"}, Joined -> True, Mesh -> True]

(The “automatic” in the Tick specification tells Mathematica to continue to handle the Y axis automatically.)

tagging time series plots in Mathematica

I hate legends on time series graphs. It’s invariably a hassle trying to figure out which color or which little marker corresponds to which line, and the legend takes up space and distracts from the data. Far better to tag the end of each data series on the graph itself, as in this example:

This is actually easy to do in Mathematica, especially if you have a standard datatype, as we do, that includes the name of the data series in its header.

Similarly, it is sometimes better to forgo datapoints entirely and to simply graph using the explanatory text itself.

Here is the code that produced the latter graph:

masterdata = {{"short person", 1, 2}, {"taller person", 2,
3}, {"the tallest", 3, 4}};

graphingdata = Map[{#[[2]], #[[3]]} &, masterdata];

ListPlot[graphingdata, PlotRange -> {{1, 3.5}, {1.5, 4.5}},
PlotMarkers -> Null,
PlotLabel -> "Often it is nice to use text rather than dots",
Epilog ->
Map[(Text[
Style[#[[1]], FontFamily -> "Times New Roman",
FontSize -> 9], {#[[2]], #[[3]]}, {-1.2, 0}] &), masterdata]]

Edward Tufte would approve.

Class and Survival on the Titanic

Detailed data about the passengers on the Titanic can be found widely online. I won’t replicate it all, though here is some basic class-level information about who survived the sinking on April 15, 1912.

data = {{"", "Lost", "Saved"}, {"Second Class Men", 147, 13}, {"Third Class Men", 399, 55}, {"Crew Men", 686, 189}, {"Third Class Children", 53, 23}, {"First Class Men", 115, 58}, {"Third Class Women", 81, 98}, {"Second Class Women", 15, 78}, {"Crew Women", 2, 21}, {"First Class Women", 5, 139}, {"First Class Children", 0, 5}, {"Second Class Children", 0, 24}};

first = RotateLeft[SortBy[Select[data, StringSplit[#[[1]]][[1]] == "First" &], First]];
second = RotateLeft[SortBy[Select[data, StringSplit[#[[1]]][[1]] == "Second" &], First]];
third = RotateLeft[SortBy[Select[data, StringSplit[#[[1]]][[1]] == "Third" &],
First]];
crew = SortBy[Select[data, StringSplit[#[[1]]][[1]] == "Crew" &], First];

A useful function:

Options[addSurvivalRatio] = {ratiodigits -> 2};

addSurvivalRatio[series_, OptionsPattern[]] :=
If[OptionValue[ratiodigits] == "none",
Map[{Last[StringSplit[#[[1]]]], #[[2]], #[[3]], #[[3]]/(#[[2]] + #[[3]]) // N} &, series],
Map[{Last[StringSplit[#[[1]]]], #[[2]], #[[3]], NumberForm[#[[3]]/(#[[2]] + #[[3]]) //N, {OptionValue[ratiodigits], OptionValue[ratiodigits]}]} &,
series]]

An overview, showing survivorship by class and man/woman/child status:

BarChart[Map[Transpose[addSurvivalRatio[#, ratiodigits -> "none"]][[4]] &, {first, second, third, crew}],
ChartElementFunction -> "FadingRectangle",
ChartLabels -> {Placed[{"First Class", "Second Class", "Third Class","Crew"}, Above], Placed[{"Men", "Women", "Children"}, Below, Rotate[#, Pi/2.4] &]},
PlotLabel -> "Survival Ratios\n"]

survivorship by class and sex

Walter Lord’s book A Night to Remember portrays third class passengers on the Titanic as nearly doomed, and suggests that some were effectively held belowdecks while first and second class passengers were evacuated to the lifeboats. This theme was picked up in James Cameron’s 1997 movie Titanic. And in Walter Lord’s 1998 book, The Night Lives On, he makes this explicit, with a chapter titled “What Happened to the Goodwins?” about a third class family that entirely perished. He implies that no similar losses were suffered in other classes.

Map[{#[[1]], NumberForm[#[[2]], 2]} &, overallTable] //
TableForm[#, TableHeadings -> {None, {"Group", "Survival"}}] &

overall survivorship table

Though it is true that third class as a whole did not do as well as first and second, the worst performing group on the ship was not among their number. The most fatal group that night was the second class men. Lord obscures this in his text by comparing the survivorship rate in third class to “first and second class” combined, thus hiding the fact that being a second class man on the Titanic was essentially a death sentence.

makeTable[group_] :=
Text[Grid[Prepend[addSurvivalRatio[group], titles2], Alignment -> {{Left, Right, Right, Right}, {Bottom, {Automatic}}}, Spacings -> {{Automatic, {2.2}}}, Background -> {Automatic, Automatic, {{2, 4} -> Pink}}, Dividers -> {None, {2 -> True}}]]

Grid[Transpose[{Map[Text[Style[#, Bold]] &, {"First Class", "Second Class","Third Class", "Crew"}], Map[makeTable[#] &, {first, second, third, crew}]}], Spacings -> {6, 3}, Alignment -> {Right, Automatic}]

how did the men do?

And while it is true that no children among the first and second class families perished, second class lost many families (Lilian and Earnest Carter, John and Sara Chapman, Harry and Shadrach Gale, Edgar and Frederick Giles, Leonard, Lewis and Stanley Hickman, Clifford and Ernest Jeffreys, William and Anna Lahtinen, and William and Dorothy Turpin).

Generally women did much better than men, even third class women did much better than first class men. It has been noticed by others that the survival rates among Americans was higher than among British citizens. Looking through the data, I notice that there were seven men on board with the title “Father,” all traveling second class, and they all died. Survivorship among priests was thus 0%, though their second-class status may well have been their primary risk factor.

With only 11 class/sex groups and only one sinking, there isn’t enough data to say much about the distribution of deaths, though it’s clearly not normally distributed. See the histogram below. A “fair” sinking would have low kurtosis, but kurtosis for this sinking was greater than 1.

Histogram[Flatten[Map[Transpose[addSurvivalRatio[#, ratiodigits -> "none"]][[4]] &, {first, second, third, crew}]], 12, PlotRange -> All,
PlotLabel ->
"Group survival probability not normally distributed,\nkurtosis " <>
ToString[NumberForm[Kurtosis[Flatten[Map[Transpose[addSurvivalRatio[#, ratiodigits -> "none"]][[4]] &, {first,second, third, crew}]]], 4]]]

histogram of survivorship by class and sex

Records have survived showing the fares paid by many passengers; the range was very wide even within each class. Converting the Pounds/shillings/pence used at the time to decimal pricing, and using a dummy variable of 1 to indicate survival, 0 to indicating death, we can run a regression to show the strength of wealth as a predictor of survival on the Titanic. I omit the raw data below because it requires so much space, but you should be able to find it online or e-mail me for a copy.

victimPaymentsDecimal =
Map[Total[# /. {s -> 1/20, d -> 1/(12*20)}] &, victimPayments]

survivorPaymentsDecimal =
Map[Total[# /. {s -> 1/20, d -> 1/(12*20)}] &, survivorPayments];

wDummies =
Join[Map[{#, 1} &, survivorPaymentsDecimal],
Map[{#, 0} &, victimPaymentsDecimal]];

dummyFit = Fit[wDummies, {x, 1}, x];

Show[ListPlot[wDummies, PlotRange -> All],
Plot[dummyFit, {x, 0, 520}, PlotStyle -> Pink], PlotRange -> All,
PlotLabel -> "Titanic Survivorship vs. Fare",
AxesLabel -> {"£", "Survivorship"}]

regression of survivorship vs. fare paid to the white star line

Lots of people died at all fare levels, though wealth is a significant predictor of survivorship. Not much changes if we omit the outliers who had paid more than 500 pounds sterling for their tickets. I’ll spare you the code and the graph, as it looks very similar to the above. However, for a final summary —

lm = LinearModelFit[wDummiesNoOutliers, x, x];

lm["ParameterTable"]

regression of survivorship vs. fare paid to the white star line

the earliest Mathematica logo

In the earliest Mathematica documentation, including the development pages visible in the scrapbook on the Wolfram website, an extremely early version of their evolving polyhedral logo is visible. It looks like this:
mathematica v1 logo

The image was, at the time, created with the following code:

In[]:= Needs["Graphics`Polyhedra`"]

In[]:= Show[Graphics3D[Stellate[Icosahedron[]]]]

This code would have worked unmodified for many releases of the software, through version 5 I suspect. Nowadays, however, to get the same output we would have to refer to the new library “PolyhedronOperations”, and use a variety of new options to shut off the otherwise automatic multi-coloration of the figure.


In[]:= Needs["PolyhedronOperations`"]

In[]:= Graphics3D[{Opacity[1], Glow[], FaceForm[Gray],
Stellate[PolyhedronData["Icosahedron", "Faces"]]},
Lighting -> "Neutral"]

And we get
mathematica logo 1 as rendered by Mathematica 8