# What are the most worthless subway stations in New York?

I was recently riding on the subway and it ocurred to me that some stations are much, much closer together than is typical. You can ride 30 blocks from West 42nd Street to West 72nd in a straight shot on the 2 or 3 lines, or 39 blocks between East 86th Street to East 125th Street on the 4 or the 5, but then sometimes you’re stuck mosying along between 110th and 116th, or between 14th and 18th. I heard a comedian make a joke about the pointlessness of stations four blocks apart, years ago. Nobody laughed, but I think he was onto something — some of these stops are very close together.

But how close? Scads of GIS files about New York City are freely available, and Mathematica has had the native ability to deal with these since version 7, though I’ve never used these functions. Let’s see what we can find out about the New York City Subways.

I used borough outlines from http://www.baruch.cuny.edu/geoportal/data/nymag/ and subway entrance locations from http://mta.info/developers/sbwy_entrance.html. I found three potentially tricky elements to the calculation \[LongDash] first, the subway entrances are not necessarily centered around their respective subway platforms; I chose a single entrance arbitrarily for each station but if I chose an outlying one, it will introduce a small error in my result. Second, the subway entrace locations are in {latitude,longitude} format (actually, {latitude,longitude}*1,000,000), and this is the format expected by Mathematica’s GeoDistance[] function, but the basemap is formated in {longitude, latitude} format, so we have to reverse the order of the elements in each location when we switch between mapping and determining distances. Third, the city’s database of subway entrances identifies an entrance as pertaining to a given line even if it is connected to that line only by an underground pedestrian tunnel. Since we care about what the trains are doing only, these have to be screened out by hand.

`stations = Import[NotebookDirectory[] <> "StationEntrances.csv"];`

``` ```

```allLines = Union[Flatten[Drop[stations, 1][[All, Range[4, 14]]]]]; allLines = Select[allLines, # != "" &]; index = Table[{i, allLines[[i]]}, {i, 1, Length[allLines]}]; ```
This function lets us extract all subway stations pertaining to a given line.

```matchLine[line_] := Union[Select[ Drop[stations, 1], #[[4]] == line || #[[5]] == line || #[[6]] == line || #[[7]] == line || #[[8]] == line || #[[9]] == line || #[[10]] == line || #[[11]] == line || #[[12]] == line || #[[13]] == line || #[[14]] == line &], SameTest -> (#1[[3]] == #2[[3]] &)]```

``` ```

`byLineUnique = Map[matchLine[#] &, allLines];`

Let’s graph it to see if this looks reasonable. Note that I specify the PlotRange to eliminate Staten Island, which has no subways.
``` bg = Import[NotebookDirectory[] <> "nymag_nyc_geog/nyc_pumas_2008.shp"];```

``` ```

```Show[bg, Graphics[{PointSize[Medium], Red, Point@(Map[Reverse, Drop[stations[[All, {25, 26}]], 1]/1000000.])}], PlotRange -> {{-74.05(*w*), -73.69(*e*)}, {40.54(*s*), 40.92(*n*)}}]```

Subway entrances as rendered by Mathematica

Looks good to me. We could easily graph each line in a difference color, connect the dots with lines, etc., but this will do for now.

The station entrance data includes all sorts of things we don’t care about, so let’s simplify it.

```vitalTable = Table[{index[[i]][[2]], Map[{#[[3]], {#[[26]], #[[25]]}/10^6} &, byLineUnique[[i]]] // N}, {i, 1, Length[byLineUnique]}];```

For the moment I don’t care about the order the stations are in, I’m going to check every station against every other one.

```vitalDistances = Table[{index[[i]][[2]], SortBy[Union[ Map[{#[[All, 1]], Round[GeoDistance[Reverse[#[[1]][[2]]], Reverse[#[[2]][[2]]]], .1]} &, Permutations[vitalTable[[i]][[2]], {2}]]], Last]}, {i, 1, Length[byLineUnique]}];```

Below is a list of subway stations combinations that aren’t traversed by the trains themselves and therefore shouldn’t be counted as “too close”. We’ll screen these out.

```disallowedCombos = {{"14 St", "6 Av"}, {"South Ferry", "Whitehall St-South Ferry"}, {"Chambers St", "Park Place"}, {"Atlantic Av", "Atlantic Av-Pacific St"}, {"Brooklyn Bridge-City Hall", "Chambers St"}, {"Franklin Av", "Botanic Garden"}, {"Botanic Garden", "Franklin Av"}, {"59 St", "Lexington Av/59 St"}, {"51 St", "Lexington Av/53 St"}, {"74 St-Broadway", "Jackson Heights-Roosevelt Av"}, {"14 St", "8 Av"}, {"62 St", "New Utrecht Av"}, {"Park Place", "World Trade Center"}, {"42 St-Bryant Pk", "5 Av"}, {"Lorimer St", "Metropolitan Av"}, {"Court Sq", "Court Sq-23 St"}, {"42 St-Port Authority Bus Terminal", "Times Sq-42 St"}, {"Chambers St", "World Trade Center"}, {"Borough Hall", "Court St"}};```

``` screenedVitalDistances = Table[{index[[i]][[2]], Select[SortBy[ Union[vitalDistances[[i]][[2]], SameTest -> (#1[[1]] == Reverse[#2[[1]]] &)], Last], ! MemberQ[disallowedCombos, #[[1]]] &]}, {i, 1, Length[byLineUnique]}]; ```

``` Map[{#[[1]], First[#[[2]]][[1]], First[#[[2]]][[2]], Last[#[[2]]][[1]], Last[#[[2]]][[2]]} &, screenedVitalDistances]```

The stations closest to each other and farthest from each other for each subway line in New York.

In the next post, I will demonstrate an interactive tool (which I have also uploaded to the Wolfram Demonstrations Project) that shows all the stations for a given line, marking the closest ones in green and the farthest ones in red, with tooltips identifying every station.