Showing posts with label KML. Show all posts

Friday, July 24, 2015

Google Earth and KML - Part Two (GETECH)

In the previous post I introduced the topic of working with geographic data in the KML format. As mentioned there, KML is the native data format used by Google Earth (GE) and it is widely supported by both GPS devices and software. I won't attempt to cover all the gory details of working with and editing KML in this series of posts (resource links at the end), but there are a few practical things I've learned about working with GPS data in the KML format that I want to share.

As a basis for this discussion I'm looking at two GPS tracks I recorded on a short kayak trip on the Kayaderosseras Creek near Saratoga Springs. One was recorded using my trusty Garmin GPS and the other was recorded using the GPS receiver in my camera, a Canon SX260. I'll discuss and compare the accuracy of the two tracks and touch on some other issues you might encounter while using GPS to collect geo-referenced data for use in your projects.

The first issue is evident in the screen shot below (Image 1). The red line represents the track captured by the GPS receiver in the camera. The purple line shows the route recorded by the Garmin. The obvious difference is that the purple line extends beyond the point where the red line stops. That's because the red line stops at the place where I put the kayak into the creek and later took it out (at the Spa State Park Canoe Launch Site on Driscoll Road). I paddled upstream for a mile or so and then floated back down. I had both GPS units turned on and both units recorded the path. The purple line continues because when I got back to the launch point I forgot to turn off the Garmin GPS. So it recorded a track for the short carry from the creek up to the parking area and for my drive home. That data is not useful so the first thing I want to do is remove it.

Image 1: Screen capture from Google Earth. The two lines (red and purple) are GPS tracks recorded simultaneously on a kayak trip on the Kayaderosseras Creek near Saratoga Springs (NY).

I could do that by editing the track directly in Google Earth. That would be simple enough, but extremely tedious. There are several mouse clicks required to delete a point and there are a lot of points to get rid of to clean up this problem. You can see this in image two. For this screen capture I turned off the display of the track recorded by the camera (the red line) and selected the purple track for editing. This makes visible the points recorded by the GPS. Each recorded point is represented by a red dot in the image and as you can see, there are a lot of points in just this short section of the route from the creek to my house. To remove this part of the track by editing it in GE you'd click on each point, select Delete, confirm the deletion and repeat. Lots of repeats.

Image 2: Screen capture from Google Earth. With the display of the track recorded by the camera (the red line) off, the track represented by the purple line was selected for editing. This allows us to see the individual points recorded by the GPS receiver. 
A better alternative is to open the KML file containing the track in a text editor and remove the unwanted points from the file. Once you have the hang of it you can edit the data in the file very quickly. There are a few steps involved in this process, which goes something like this:
  1. Export the track to a file. In Google Earth right click on the item in the Places pane, select Save Place As. Be sure to select KML from the format drop down so the file is saved in the KML format.
  2. Open the file in a text editor or an XML editor (I use Notepad++, a free and open source editor that is popular with programmers). You may need to configure your editor to recognize that the KML file is an XML file. That should allow the editor to display the contents with nice formatting.
  3. In this case, removing the points I don't want comes down to removing all the points where the latitude is greater than the latitude of the put-in point. I'm in the northern hemisphere --so the latitude coordinates increase as you go north-- and all the points I want to keep are south of the put-in. Image three shows the map zoomed in on that location with the properties dialog for the point displayed. The properties dialog shows the latitude and longitude of the marker (and the location). This is how I got the latitude I will use as the basis for my edits; any point with a latitude greater than 43.034293 can be removed from the file. A potential complication is that the coordinates in the KML file are stored using the decimal degree format. That's the format shown in the dialog in image three, but it is not the default representation for coordinates in Google Earth. The default is to represent coordinates in the Degree/Minute/Second format, like this: 43°02'03.4548". You can change the representation used by Google Earth using the Options dialog on the Tools menu.


Image 3: Screen capture from Google Earth. Viewing the properties of a marker added at the location of the start/end point of the trip. The properties dialog shows the coordinates of that point.
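If you only have the Degree/Minute/Second form of a coordinate, the conversion to decimal degrees is simple arithmetic: degrees plus minutes/60 plus seconds/3600, with south and west values made negative. Here is a minimal Python sketch of that conversion. The function name and its hemisphere argument are my own choices for illustration, not anything provided by Google Earth.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative,
    which matches how KML stores coordinates.
    """
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemisphere in ("S", "W") else value


# The put-in latitude from the example above: 43°02'03.4548" north
put_in_latitude = dms_to_decimal(43, 2, 3.4548, "N")
print(round(put_in_latitude, 6))  # prints 43.034293
```

That matches the decimal value shown in the properties dialog, so either representation can be used to find the cut-off latitude.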

Once you have a KML file containing your track, and you have a way to identify the point(s) to remove (in this case, latitude > 43.034293) you can remove the unwanted points from the file. To do this you do need to know a little bit about the XML data format and how to edit data stored in XML, but there are many introductions to XML available on-line and I've included references at the end of this post. Also, if you've done any work with HTML this will all look familiar. XML data elements are placed inside of tag pairs and you must delete the entire tag pair or you will break the formatting of the file (and GE will complain when you try to reload it). You'll also need to recognize how the coordinates are represented. An example from my file looks like this:

<gx:coord>-73.79277999999999 43.034036 102.52</gx:coord>

The coordinate of the location is represented by three values with a space placed between each value (it might be hard to see the space; -73.79277999999999<space>43.034036<space>102.52). The values are:

Value                 Meaning     Notes
-73.79277999999999    longitude   a "west" longitude represented as a negative number
43.034336             latitude    a "north" latitude represented as a positive number
102.52                elevation   meters above sea level

As discussed in the previous section, I want to delete any point where the latitude is greater than 43.034293, so this point can be removed from the file. The points are stored in the file in order, so once you find the break between what you want to keep and what you want to remove you can select all the bad rows and remove them. If editing data in this way is familiar to you then this should get you started. If this is entirely new then you may need more help. Look at the references or ask a friend who knows about this stuff.
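If you would rather not select the rows by hand, the same filter can be scripted. The following is a minimal Python sketch, not the procedure used above. It assumes the track is stored as gx:coord elements inside a gx:Track, and that each gx:coord may be paired by position with a when timestamp, which is then removed along with it. The sample coordinates and timestamps are made up for illustration, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
GX_NS = "http://www.google.com/kml/ext/2.2"

# Made-up sample track: two points south of the put-in, two north of it.
SAMPLE = f"""<kml xmlns="{KML_NS}" xmlns:gx="{GX_NS}">
  <Placemark>
    <gx:Track>
      <when>2015-07-24T14:00:00Z</when>
      <when>2015-07-24T14:00:05Z</when>
      <when>2015-07-24T14:00:10Z</when>
      <when>2015-07-24T14:00:15Z</when>
      <gx:coord>-73.792700 43.034100 102.0</gx:coord>
      <gx:coord>-73.792800 43.034250 102.5</gx:coord>
      <gx:coord>-73.792900 43.034400 103.0</gx:coord>
      <gx:coord>-73.793000 43.034900 104.0</gx:coord>
    </gx:Track>
  </Placemark>
</kml>"""


def drop_points_north_of(kml_text, max_latitude):
    """Remove every gx:coord whose latitude is greater than max_latitude.

    Returns the parsed root element and the number of points removed.
    """
    root = ET.fromstring(kml_text)
    removed = 0
    for track in root.iter(f"{{{GX_NS}}}Track"):
        whens = track.findall(f"{{{KML_NS}}}when")
        coords = track.findall(f"{{{GX_NS}}}coord")
        # Only touch timestamps when they line up one-to-one with the points.
        paired = len(whens) == len(coords)
        for index, coord in enumerate(coords):
            # Order inside gx:coord is: longitude latitude elevation
            latitude = float(coord.text.split()[1])
            if latitude > max_latitude:
                track.remove(coord)
                if paired:
                    track.remove(whens[index])
                removed += 1
    return root, removed


root, removed = drop_points_north_of(SAMPLE, 43.034293)
print(removed)  # prints 2

# Keep the familiar prefixes when writing the result back out as text.
ET.register_namespace("", KML_NS)
ET.register_namespace("gx", GX_NS)
cleaned_kml = ET.tostring(root, encoding="unicode")
```

Because the script removes whole elements, there is no risk of leaving half a tag pair behind. As with hand editing, run it on a copy of the file.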

With unwanted data removed from your file there is one final step; opening the file in Google Earth so you can see the results of your work. In Google Earth select "Open" from the File menu and find your edited file on disk. If you get warnings about errors check the file over for mis-matched tag pairs. That's it, you can save the file in your Places folder so that Google Earth reopens it automatically or you can reopen the file when you need it in the future.

This post is already rather long so I'm going to cover the other topics I raised at the start in a separate follow-up post.



Sunday, July 19, 2015

Using Google Earth for Geographic Exploration (GETECH)

Readers of this blog know that I use Google Earth (GE) to plan outings and to manage data that I collect using various GPS devices. I use Google Earth as a sort of entry-level Geographic Information System (GIS) and in the following series of posts I plan to share what I've learned about using Google Earth as part of a citizen/community science data management system. I'll probably move these posts out to a separate blog in the future but to get it started I'm putting them here. If you aren't interested in these more technically oriented posts you can skip them. I'll stick (GETECH) on the title to make them easy to recognize.

On most of my outings I collect geo-referenced data using one or more GPS devices. I have a Garmin hand-held GPS, GPS in my phone and even a camera with GPS built-in. I collect "tracks" (the GPS receiver stores a point every few seconds, allowing you to trace your route) and I also save waypoints. These are specific locations of interest and the GPS saves the coordinates along with a name you enter. The Garmin GPS can grab a bunch of coordinate pairs in quick succession and average them to improve the accuracy of the fix. This is useful if you want to get the most accurate location that your device can record, usually around ten feet. I'll do a future post on GPS accuracy to explain this limitation.

After a trip I upload the data from my GPS devices into GE. This allows me to see the tracks (my route) and waypoints in the geographic context provided by GE. If the data was collected for a specific use I move it into whatever storage system I'm using for that project (more on this in a future post). My Garmin GPS is directly supported by GE so I simply plug the GPS unit into a USB port and use the GPS option on the Tools Menu. There are options that allow you to select what you want to import. By default GPS data is added into a sub-folder of the "Temporary Places" folder in your GE Places. I typically tweak things a bit and move the imported data into the "MyPlaces" folder. This allows me to save each GPS data set for ready access in GE. You could also right click on your newly uploaded dataset and save it to your computer using the Save Place As option.

It's even easier to grab the GPS data using the phone. I have several GPS related apps on my Android phone (a Google Nexus 5) but I most often use GPS Essentials. As with the dedicated Garmin, GPS Essentials allows me to record tracks and save waypoints (it has other capabilities that I'll discuss in future posts). To get the data into GE I use the Export feature of GPS Essentials and save the data directly to Google Drive (Internet-based storage). It takes a few minutes but once the data shows up in Google Drive I save it to my local computer as a KML file. KML is the native file format of Google Earth and most software that uses geographically referenced data can use data in the KML format.

This has been my basic working process for several years but I've recently become concerned that it was breaking down. The data I collect is best categorized as "ecological inventory". I note where species are found, when I saw them and I record additional information about the circumstances. The "where" and "when" are essential to the data having value and I was not managing this information in a consistent way. It's there --embedded in the KML files-- but I didn't have an easy way to find everything related to a location, or everything for a specific time span. Some of the data was stored in raw files and some was in databases that I've created. To maximize its value the data needs to be consolidated and I need a more robust process for recording and managing the metadata. Also, with over 200 top-level sub-folders in my MyPlaces folder, Google Earth was running noticeably slower.

Screen capture of Google Earth after I cleaned up and organized my MyPlaces data. The lines are tracks captured using GPS and the markers are points of interest.

So I looked into how GE actually manages this data and found that all of the data you see in GE in the "MyPlaces" folder is actually stored as a single file on your computer named myplaces.kml. The location of this file varies based on the operating system of your computer and the version of GE you are using. On my computer, running Windows 7 and using Google Earth Pro, the file is saved to:
C:\Users\<your-windows-name>\AppData\LocalLow\Google\GoogleEarth

Seeing the MyPlaces.kml file made it clear why GE was straining a bit. The MyPlaces.kml file on my computer was just under 50 megabytes in size and using a text editor to open it I found that it contained over 1.1 million lines. For context, if we assume that a page of text averages around 60 lines, the MyPlaces file on my computer contained over 18,000 pages. That's a bit much.
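To get the same figures for a file of your own without opening it in an editor, a few lines of Python will do. This is a minimal sketch: the function name is hypothetical, the 60 lines per page figure is just the rough estimate used above, and the demo runs on a small temporary file rather than a real MyPlaces.kml.

```python
import os
import tempfile


def summarize_text_file(path, lines_per_page=60):
    """Report size in megabytes, line count and an estimated page count."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    # Read in binary mode so an odd character encoding cannot stop the count.
    with open(path, "rb") as handle:
        line_count = sum(1 for _ in handle)
    return {
        "size_mb": size_mb,
        "lines": line_count,
        "pages": line_count / lines_per_page,
    }


# Demo on a throwaway 120-line file; point the function at a COPY of your
# own KML file to measure the real thing.
with tempfile.NamedTemporaryFile("w", suffix=".kml", delete=False) as tmp:
    tmp.write("<Placemark/>\n" * 120)
    demo_path = tmp.name

summary = summarize_text_file(demo_path)
print(summary["lines"], summary["pages"])  # prints 120 2.0
os.remove(demo_path)
```

By the same arithmetic, 1.1 million lines at 60 lines per page works out to a little over 18,000 pages.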

I had let the situation get out of hand and now it was going to require a lot of work to straighten it out. One approach would be to use the GE user interface. In GE you can save each folder or item in the MyPlaces folder to a separate file on your computer. You can then delete that item from MyPlaces and open the file when you want to use that data. If the item you save is a folder GE creates a single archive file (with the .KMZ extension) containing the entire contents of the folder. This approach would work but it was going to be a tedious process to say the least.

A second alternative is to edit a copy of the MyPlaces.kml file using a text editor (note that I said "edit a copy" - don't edit the file used by GE). You can use any text editor to open a KML file. Unfortunately, given the size of the file on my computer this also was going to be a slow and error-prone process. Using Notepad++ --an editor designed for working with lots of text-- opening and navigating the file was extremely sluggish (recall the 1.1 million lines). I could have copied out sections and worked on pieces of the file but the data is structured (it's an XML file if you are familiar with that sort of thing) and there are references among different sections in the file. It would be very easy to mess this up.

The solution I chose was to write a script to run through the file (a copy of course) and save each folder, document and placemark to a new, separate, KML file. I could then remove the MyPlaces.kml file used by GE, after making a backup copy (make sure that GE is not running if you do this). The next time GE is run it creates a new and empty MyPlaces.kml file and you are back to a "clean" install of GE. The separate KML files created by the script can then be opened using GE when I need them.

I'm making this script available for anyone who wants to run it and you can access it here:
https://gist.github.com/kentstanton/3441cc368d3c52621b19

Please note that you must have Windows PowerShell 5.0 installed on your computer to run the script (the very latest version as of this writing) and you need to know how to run PowerShell scripts. If you are familiar with PowerShell programming you can alter the script as needed, and it would not be too difficult to port the code to a different language if you do not have a Windows computer to run it on. I'm working on a larger project that will incorporate this functionality and remove this requirement but I wanted to go ahead and make this available now because it might be useful to some people as is.
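For readers without PowerShell, the same idea (one KML file per top-level item) can be sketched in Python. To be clear, this is not a port of the script linked above; it is an independent, minimal sketch under a few assumptions: the items to split are direct children of the first Document element, any Style or StyleMap elements at that level are shared and get copied into every output so styleUrl references keep working, and the sample data and function names are made up. A real MyPlaces.kml may nest its folders one level deeper, in which case you would point the code at that inner container instead.

```python
import copy
import re
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"
ET.register_namespace("", KML_NS)

# Made-up stand-in for a (much larger) places file.
SAMPLE = f"""<kml xmlns="{KML_NS}">
  <Document>
    <name>My Places</name>
    <Style id="trackStyle"><LineStyle><color>ff0000ff</color></LineStyle></Style>
    <Folder>
      <name>Kayaderosseras Creek</name>
      <Placemark><name>Put-in</name><styleUrl>#trackStyle</styleUrl></Placemark>
    </Folder>
    <Placemark><name>Moreau Lake</name></Placemark>
  </Document>
</kml>"""


def q(tag):
    """Qualify a tag name with the KML namespace."""
    return f"{{{KML_NS}}}{tag}"


def split_top_level(kml_text):
    """Return {file_name: kml_text}, one entry per top-level item."""
    root = ET.fromstring(kml_text)
    container = root.find(q("Document"))
    if container is None:
        raise ValueError("expected a Document element directly under <kml>")
    shared = [c for c in container if c.tag in (q("Style"), q("StyleMap"))]
    outputs = {}
    for item in container:
        if item.tag not in (q("Folder"), q("Document"), q("Placemark")):
            continue
        name = item.findtext(q("name"), default="unnamed")
        safe = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_") or "unnamed"
        # Build a small stand-alone KML document around a copy of the item.
        new_root = ET.Element(q("kml"))
        new_doc = ET.SubElement(new_root, q("Document"))
        for style in shared:
            new_doc.append(copy.deepcopy(style))
        new_doc.append(copy.deepcopy(item))
        file_name = f"{len(outputs) + 1:04d}_{safe}.kml"
        outputs[file_name] = ET.tostring(new_root, encoding="unicode")
    return outputs


files = split_top_level(SAMPLE)
for file_name in files:
    print(file_name)
```

The function returns the new documents as text rather than writing them, so you can inspect the result first; saving each entry with open(file_name, "w", encoding="utf-8") is the obvious next step. The numeric prefix keeps file names unique when two items share a name. On a file with over a million lines, parsing everything into memory at once may be slow, so treat this as a starting point.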



Wednesday, May 20, 2015

Moreau Lake State Park Boundary (KML) for Google Earth

This is a follow up to my recent post on the addition of lands formerly held by the McGregor Correctional facility to Moreau Lake State Park. That post included a map image with a rough outline of the newly added parcel. I've refined that boundary and saved it to a KMZ format file that can be opened in Google Earth/Maps.

Link to the Moreau Lake State Park Boundary File
https://drive.google.com/file/d/0B3QHBL0PwDqWZFpxSG9nYU1BY0E/view?usp=sharing

Clicking the link will open a new browser window (or tab) showing Google Maps with the boundary layer displayed as an overlay. You can also download the boundary from Google Maps by clicking the Download Button (down arrow icon) on the Google Maps toolbar. The file is in the KML/KMZ format native to Google Earth/Maps. If you have Google Earth installed on your computer you can double click the downloaded file to open it.

Image 1: Moreau Lake State Park Boundary.
Note: These boundaries are approximate and not official in any way. (Base map source: Google Earth)
The red outline represents the boundary of the main section of Moreau Lake State Park. The newly added "Lake Bonita" section is shown in blue (lower left). The area in brown (near the top) is the section along the Hudson River and north of Spier Falls Road. The section with the green outline is the area east of the Hudson (previous post on this here).

Image 2: The same boundary outlines viewed as an overlay in Google Maps. 
Viewing the boundary layer in Google Maps you might wonder why the outlines I've provided don't match the park boundary shown tinted light green. Most striking is that Maps only shows the main section of the Park, probably because the data that Google used as the basis for the parks layer was created before the section east of the river was added. That question of "when was the data created" also applies to the new Lake Bonita section. The boundary outline I've provided was digitized from images provided with the press release announcing the land transfer (so again, it is only approximate).

Perhaps the most interesting discrepancy is the area outside of the red outline seen just below and left of Moreau Lake. That's not newly added so, is it in the park or outside of the park? It's probably a data error in Google Maps. No other source that I've seen includes that acreage in the Park and if you zoom in, as seen in Image 3, you can see that there are streets located there. I know from having been there that there are houses on those streets so that area is probably not inside the State Park.

Image 3: The section in green, but outside of the red outline, is probably a data error. (Source Google Maps)
The outlines I've shared were digitized from base map layers obtained from several sources. My starting point was the official Park map and I edited and added detail using a base map layer from ArcGIS Online (Esri). My point in discussing these discrepancies is that geographic data usually represents a moment in time. If our subject of interest is represented by data that changes continuously then we can consider that data to be out-of-date the second we save it to a database. Even something far less dynamic --a State Park boundary for example-- will change over time and representations provided by varied sources will not necessarily be in agreement. And then there are errors of all kinds and the question of accuracy. But that's another whole story.

As a representation of the real world, geographic data is uncertain. And uncertain is not the opposite of certain. It's a lot more complicated than that.