Data conversion tasks often require processing numerous input files that arrive in batches. Altova MapForce includes features that let you handle groups of files with minimal intervention. For instance, we recently copied a set of three files from the memory card of a digital camera with GPS support. Each .LOG file is a CSV file containing GPS coordinates for a single route.
We quickly designed a mapping to convert the CSV data to XML-based .gpx format and processed all three files to generate three output files in a single execution:
First, we used a wildcard character in the Input file name in the Properties dialog for the input component of the mapping. This instructs MapForce to individually process every file in the working directory that matches the wildcard.
If you are designing a complex conversion, or if the input files are very large, you can use a single unique filename to develop the mapping, then change to a wildcard when you are satisfied with the mapping output.
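For readers who want to see the wildcard idea outside MapForce, here is a minimal Python sketch of the same selection step. It is not how MapForce implements the feature; the function name `matching_inputs` and the default pattern `*.LOG` are assumptions made for illustration.

```python
from pathlib import Path


def matching_inputs(directory, pattern="*.LOG"):
    """Return the files in `directory` whose names match `pattern`, sorted by name.

    `matching_inputs` and the default pattern are hypothetical names chosen
    for this sketch; they are not part of MapForce.
    """
    return sorted(p for p in Path(directory).glob(pattern) if p.is_file())


if __name__ == "__main__":
    # Each matching file would be converted individually, one output per input.
    for log_file in matching_inputs("."):
        print("would process", log_file.name)
```

Passing a single file name instead of a wildcard as `pattern` mirrors the suggestion above of developing against one file first and widening the pattern later.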
File Path Functions
The built-in MapForce Function Library includes file path functions we can use to manage output file names. If we define a single output file, the data from each successive input file is appended to it, so to get one output file per input file we need to generate the output file names dynamically.
You can combine file path functions with other string functions for complete control of output file names and locations. We decided to leave the output in the same directory as the input files, but to create more descriptive filenames, and to use the .gpx file extension.
The portion of the mapping shown below uses the string concat function with file path functions to generate the output file 1211190converted.gpx from 1211190.LOG, and so on.
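The same file name logic can be sketched in Python for comparison. This is only an illustration of the string manipulation, not the MapForce mapping itself; the function name `output_name` and its parameters are hypothetical.

```python
from pathlib import Path


def output_name(input_path, suffix="converted", extension=".gpx"):
    """Build the output path next to the input: <stem><suffix><extension>.

    `output_name`, `suffix`, and `extension` are illustrative names only.
    """
    p = Path(input_path)
    # Strip the original extension, append the descriptive suffix,
    # and keep the file in the same directory as the input.
    return p.with_name(p.stem + suffix + extension)


if __name__ == "__main__":
    print(output_name("1211190.LOG"))  # prints 1211190converted.gpx
```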
You can also use file path functions to generate strings and insert them as output. The XML Schema for .gpx files contains a metadata description element. We decided to insert the input file name into the metadata to explicitly link the output file to the original data. This strategy makes the output file self-documenting, and can help with debugging if you need to trace unexpected output back to the original source.
The portion of the mapping shown below inserts the source file name into a string and maps the string to the metadata <desc> element:
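As a rough code equivalent of that step, the sketch below builds a `<metadata>` element whose `<desc>` child names the source file. The wording "Converted from" and the function name `metadata_with_source` are assumptions for illustration, and the GPX namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def metadata_with_source(input_path):
    """Return a <metadata> element whose <desc> records the source file name.

    The description text is a hypothetical example, not the string used
    in the original mapping.
    """
    metadata = ET.Element("metadata")
    desc = ET.SubElement(metadata, "desc")
    desc.text = f"Converted from {Path(input_path).name}"
    return metadata


if __name__ == "__main__":
    element = metadata_with_source("1211190.LOG")
    print(ET.tostring(element, encoding="unicode"))
```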
The core of this data mapping required a filter on the input file. The camera GPS log files are recorded according to the National Marine Electronics Association (NMEA) specification. A portion of one of the input files is shown below:
After the first line, each recorded point is described by two NMEA sentences, where the sentence type is identified in the first field. Each GGA sentence includes the time, latitude, longitude, elevation, and additional data about the quality of the fix. Each RMC sentence contains the time, latitude, longitude, and date.
An RMC sentence contains the minimum data we need to generate a .gpx <trkpt> element, so we can use a filter to select only those lines from the input, as shown here:
If the message type in the first field of a row contains “$GPRMC”, the row is passed through for processing. If not, the row is ignored.
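In code, the filter amounts to a test on the first comma-separated field. The Python sketch below is an illustration only: the function name `rmc_rows` is hypothetical, and the sample sentences are generic NMEA examples rather than data from the camera logs described here.

```python
def rmc_rows(lines):
    """Yield the comma-separated fields of each line whose first field contains $GPRMC."""
    for line in lines:
        fields = line.strip().split(",")
        if "$GPRMC" in fields[0]:
            yield fields  # passed through for processing
        # any other row (header line, GGA sentence, ...) is ignored


# Illustrative input only; not taken from the original log files.
SAMPLE = [
    "header line written by the device",
    "$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47",
    "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A",
]

if __name__ == "__main__":
    for row in rmc_rows(SAMPLE):
        print(row[0], row[3], row[4])
```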
The actual data in the input file also required some manipulation. For each latitude and longitude, we had to combine multiple fields in the source that defined degrees, minutes, and seconds and convert them to decimal degrees. We also needed to combine the time and date fields and record the result in ISO 8601 format as required by .gpx, such as 2012-11-19T20:43:23Z. We defined each of those conversions as a user-defined function to encapsulate its complexity and keep it separate from the main mapping.
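The two conversions can be sketched in Python as small helper functions, in the same spirit as the user functions in the mapping. The sketch assumes the standard NMEA 0183 RMC layout, in which a coordinate is a `ddmm.mmmm` (or `dddmm.mmmm`) value followed by a hemisphere letter, the time is `hhmmss` in UTC, and the date is `ddmmyy`; the field layout in the files described here may differ. The function names are hypothetical.

```python
from datetime import datetime


def nmea_to_decimal_degrees(value, hemisphere):
    """Convert an NMEA (d)ddmm.mmmm value plus hemisphere letter to signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)        # leading digits are whole degrees
    minutes = raw - degrees * 100    # remainder is decimal minutes
    decimal = degrees + minutes / 60
    # South and west are negative in decimal-degree notation.
    return -decimal if hemisphere in ("S", "W") else decimal


def nmea_to_iso8601(date_ddmmyy, time_hhmmss):
    """Combine NMEA date and UTC time fields into an ISO 8601 timestamp."""
    # Drop any fractional seconds, e.g. "204323.000" -> "204323".
    stamp = datetime.strptime(date_ddmmyy + time_hhmmss[:6], "%d%m%y%H%M%S")
    return stamp.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    print(nmea_to_iso8601("191112", "204323"))  # prints 2012-11-19T20:43:23Z
    print(round(nmea_to_decimal_degrees("4807.038", "N"), 4))
```

Keeping each conversion in its own function mirrors the design choice described above: the main processing loop stays readable, and each conversion can be tested in isolation.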