Nov 11, 2007
NF1 Objects
The best place to start is with data structures: what is the best way to represent dynamic data, including financial data, in Mathematica? Most people would probably start with a list of date/data pairs, like this:
data1 = {{"1/1/2000", 37}, {"1/2/2000", 39}, {"1/3/2000", 33.2}};
This data can be graphed, curves can be fit against it, it can be compared to, or manipulated in light of, other datasets. Experience shows, however, that this simple data structure soon grows complicated. We quickly have data1, data2, data3, and various manipulated versions of them, and it can be a mess keeping track of which of these represent what. And, especially when generating tables of computed characteristics, or labelling graphs, it becomes onerous to attach the right name to the right data series.
Imagine, for example, that you have 20 series representing the prices of stocks over time, and a function that checks the growth rate of each. You want to run this function on each series and report back a table listing the name of each stock and its growth rate, in declining order of growth. There is no easy way to do this with the simple data structure shown above; we have the information needed to compute the growth rate, but we don't have the name of the stock.
The best solution is to eschew the simple list of date/data pairs, and to standardize on a slightly more complex data structure that includes the name of the data series and other useful information. The particular structure I use is called "nf1", and looks like this:
{{formatcode, name, timeunit, type}, {{date, data}, {date, data}, etc.}}
where formatcode is always "nf1", name is the name of the series ("IBM US Equity", for example), timeunit is the sampling interval ("monthly", for example), and type is "pctchanges", "delta", or "value". I have written code that converts data from all my usual data sources to this format, and some VBA that is useful for converting random spreadsheets sent to me by others.
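As a concrete illustration, the date/data pairs in data1 above could be wrapped in an nf1 object along these lines. This is only a sketch: the name "Example Series" and the header values "day" and "value" are assumptions chosen for illustration, not taken from a real data source.

```mathematica
(* the date/data pairs from the earlier example *)
data1 = {{"1/1/2000", 37}, {"1/2/2000", 39}, {"1/3/2000", 33.2}};

(* header is {formatcode, name, timeunit, type}; the name, timeunit,
   and type used here are made up for illustration *)
series1 = {{"nf1", "Example Series", "day", "value"}, data1};

series1[[1, 2]]        (* the series name, read from the header *)
series1[[2, All, 2]]   (* the data values, without their dates *)
Last[series1[[2]]]     (* the most recent date/data pair *)
```

Because the name travels with the data, any function handed series1 can label its output without being told separately which series it is looking at.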
Most of the sample code that will follow on these pages will assume that the data is in nf1 format.
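With the name carried in the header, the ranking table described earlier becomes straightforward. The sketch below uses two made-up series of type "value" and a hypothetical helper, growthRate, that measures growth as the last value divided by the first, minus one. Both the helper and the numbers are assumptions for illustration only.

```mathematica
(* two made-up nf1 objects; names and prices are illustrative only *)
stockA = {{"nf1", "Stock A", "monthly", "value"},
   {{"1/1/2000", 100.}, {"2/1/2000", 104.}, {"3/1/2000", 110.}}};
stockB = {{"nf1", "Stock B", "monthly", "value"},
   {{"1/1/2000", 50.}, {"2/1/2000", 49.}, {"3/1/2000", 60.}}};

(* hypothetical helper: total growth from the first value to the last *)
growthRate[obj_] := obj[[2, -1, 2]]/obj[[2, 1, 2]] - 1;

(* pair each name with its growth rate, then sort by declining growth *)
report = Sort[({#[[1, 2]], growthRate[#]} &) /@ {stockA, stockB},
   #1[[2]] > #2[[2]] &];

TableForm[report, TableHeadings -> {None, {"Name", "Growth"}}]
```

The same two lines that build report would work unchanged for 20 series, since each object supplies its own label.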
Over the next few days, I will post some useful functions for working with nf1 objects, starting with nf1OK, which does a quick check of any variable passed to it, to see if it is a properly formatted nf1 object.
nf1OK[object_] :=
 Module[{std},
  Which[
   (* an nf1 object has exactly two parts: a header and a data list *)
   Length[object] != 2,
   Message[nf1OK::twoparts, Length[object]]; False,
   (* the header has exactly four fields *)
   Length[object[[1]]] != 4,
   Message[nf1OK::headlengths, Length[object[[1]]]]; False,
   (* the first header field must be the format code "nf1" *)
   object[[1, 1]] =!= "nf1",
   Message[nf1OK::typecode, object[[1, 1]]]; False,
   (* the series name must be a string *)
   ! StringQ[object[[1, 2]]],
   Message[nf1OK::nametype, object[[1, 2]]]; False,
   ! MemberQ[{"year", "quarter", "monthly", "week", "day", "hour",
      "minute", "second"}, object[[1, 3]]],
   Message[nf1OK::timeunit, object[[1, 3]]]; False,
   ! MemberQ[{"pctchanges", "delta", "value"}, object[[1, 4]]],
   Message[nf1OK::seriestype, object[[1, 4]]]; False,
   (* every entry must be a {date, data} pair; the date may itself be a
      list, so the pair is not required to pass VectorQ *)
   ! MatchQ[object[[2]], {{_, _} ...}],
   Message[nf1OK::datalist, object[[1, 2]]]; False,
   Length[object[[2]]] < 1,
   Message[nf1OK::datalength, object[[1, 2]]]; False,
   (* SternDateTypeNF1 (defined elsewhere) must return a known date type *)
   ! MemberQ[{"M", "MI", "C"}, std = SternDateTypeNF1[object]],
   Message[nf1OK::dateformat, object[[1, 2]]]; False,
   (* every date must have the form its date type calls for *)
   MemberQ[Switch[std,
     "M", (! VectorQ[#1] &) /@ object[[2, All, 1]],
     "MI", (! IntegerQ[#1] &) /@ object[[2, All, 1]],
     "C", (! StringQ[#1] &) /@ object[[2, All, 1]]], True],
   Message[nf1OK::nulldates, object[[1, 2]]]; False,
   (* no date may be zero or empty; this runs after the form check so
      that Total and StringLength only see well-formed dates *)
   MemberQ[Switch[std,
     "M", (Total[#1] == 0 &) /@ object[[2, All, 1]],
     "MI", (#1 < 1 &) /@ object[[2, All, 1]],
     "C", (StringLength[#1] < 1 &) /@ object[[2, All, 1]]], True],
   Message[nf1OK::datecheck, object[[1, 2]]]; False,
   (* every data value must be numeric *)
   MemberQ[(! NumericQ[#1] &) /@ object[[2, All, 2]], True],
   Message[nf1OK::nulldata, object[[1, 2]]]; False,
   (* all checks passed *)
   True, True]]
It is always good form to include a usage message.
nf1OK::usage = "nf1OK[object] checks an object to see if it looks like a good NF1 object. It issues a message and returns False at the first problem it finds, or returns True if all is well.";
And error messages appear below.
nf1OK::twoparts = "An NF1 object must have two parts. You have `1`.";
nf1OK::typecode = "The typecode of this object reads `1` rather than nf1.";
nf1OK::headlengths = "The header of an NF1 object must have four parts. You have `1`.";
nf1OK::nametype = "The name of this object needs to be a string. You have `1`.";
nf1OK::timeunit = "The timeunit needs to be year, quarter, monthly, week, day, hour, minute, or second. You have `1`.";
nf1OK::seriestype = "The type needs to be pctchanges, delta, or value. You have `1`.";
nf1OK::datalength = "You need at least one date/data pair in object `1`.";
nf1OK::datecheck = "You cannot have any dates of 0 in object `1`.";
nf1OK::nulldates = "You have malformed or null dates in object `1`.";
nf1OK::nulldata = "You cannot have null or non-numeric data in object `1`.";
nf1OK::datalist = "There is a problem with the list of date/data pairs in object `1`.";
nf1OK::dateformat = "Cannot figure out the date format in object `1`.";
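To see the messages in action, nf1OK can be run on a few deliberately broken objects. The sketch below assumes that nf1OK and its messages have already been evaluated as above. The test objects are made up, and each one fails a header check before SternDateTypeNF1 is ever called, so no definition of that function is needed here.

```mathematica
(* made-up objects, each broken in exactly one way *)
threeParts = {{"nf1", "Test", "day", "value"}, {{"1/1/2000", 1}}, "extra"};
badCode = {{"nf2", "Test", "day", "value"}, {{"1/1/2000", 1}}};
badUnit = {{"nf1", "Test", "fortnight", "value"}, {{"1/1/2000", 1}}};

(* each call should issue the matching message and give False *)
nf1OK[threeParts]   (* three parts instead of two *)
nf1OK[badCode]      (* format code is not "nf1" *)
nf1OK[badUnit]      (* "fortnight" is not an allowed timeunit *)
```

Checking a fully valid object goes further down the chain and does require SternDateTypeNF1 to be defined.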
Advanced placement observation 1 — the "timeunit" code in the nf1 header has never proven to be useful, as every datapoint has a timestamp anyway. I can imagine scenarios in which it would be useful, but in practice I have never used it.
Advanced placement observation 2 — the code above depends on a function called "SternDateTypeNF1," which I have not yet explained. That will come soon.

