How do I use a GTFS feed?

I want to use a GTFS feed in Google Maps, but I don't know how to. I want to display the buses available from a route. Just so you know, I'm planning on implementing the Google Map I make in a Visual C# application.

This is a very general question, so my answer will necessarily be general as well. If you can provide more detail about what you're trying to accomplish I'll try to offer more specific help.
At a high level, the steps for working with a GTFS feed are:
Parse the data. From the GTFS feed's URL you'll obtain a ZIP file containing a set of CSV files. The format of these files is specified in Google's GTFS reference, and most languages already have a CSV-parsing library available that can be used to read in the data. Additionally, for some languages there are GTFS-parsing libraries available that will return data from these files as objects; it looks like there's one available for C#, gtfsengine, that you might want to check out.
Load the data. You'll need to store the data somewhere, at least temporarily, to be able to work with it. This could simply be a data structure in memory (particularly if you've written your own parsing code) but since larger feeds can take some time to read you'll probably want to look at using a relational database or some other kind of storage you can serialize to disk. In the application I'm developing, a separate process parses and loads GTFS data into a relational database in one pass.
Query the data. Obviously how you do this will depend on the method you use for storing the data and the purpose of your application. If you're using a relational database, you will typically have one table per GTFS entity (or CSV file) on which you can construct indices and against which you can execute SQL queries. If you're working with objects in memory, you might construct a hash-table index in memory as well and query that to find the data you need.
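The three steps above can be sketched in C# roughly as follows. This is only a minimal sketch, assuming the feed has already been downloaded to a local ZIP file and that it contains the standard trips.txt file with a route_id column. The GtfsSketch class and its method names are hypothetical (they do not come from gtfsengine or any other library), and the CSV splitter is deliberately simplified.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration; not part of any GTFS library.
public static class GtfsSketch
{
    // Splits one CSV line, honoring double-quoted fields ("" is an escaped quote).
    // Simplification: fields that contain line breaks are not supported.
    public static List<string> SplitCsvLine(string line)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;
        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"');
                    i++; // skip the second quote of the escaped pair
                }
                else if (c == '"') inQuotes = false;
                else current.Append(c);
            }
            else if (c == '"') inQuotes = true;
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Clear();
            }
            else current.Append(c);
        }
        fields.Add(current.ToString());
        return fields;
    }

    // Step 1 (parse): reads one file of the feed into rows keyed by column name.
    public static List<Dictionary<string, string>> ReadTable(ZipArchive feed, string fileName)
    {
        var rows = new List<Dictionary<string, string>>();
        ZipArchiveEntry entry = feed.GetEntry(fileName);
        if (entry == null) return rows; // file not present in this feed

        using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
        {
            string headerLine = reader.ReadLine();
            if (headerLine == null) return rows;
            List<string> header = SplitCsvLine(headerLine);

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue;
                List<string> values = SplitCsvLine(line);
                var row = new Dictionary<string, string>();
                for (int i = 0; i < header.Count && i < values.Count; i++)
                    row[header[i].Trim()] = values[i];
                rows.Add(row);
            }
        }
        return rows;
    }

    // Steps 2 and 3 (load, query): keeps the trips in memory, indexed by route_id.
    // Throws KeyNotFoundException if trips.txt has no route_id column.
    public static ILookup<string, Dictionary<string, string>> TripsByRoute(ZipArchive feed)
    {
        return ReadTable(feed, "trips.txt").ToLookup(trip => trip["route_id"]);
    }
}
```

With a feed saved locally (the path here is made up), ZipFile.OpenRead("feed.zip") yields the ZipArchive to pass in, and TripsByRoute(feed)["10"] would then enumerate the trips on route 10. For larger feeds the same rows could be written to a relational database instead of being held in a lookup.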

Related

What is the easiest way to store and query data without using a database?

I am fairly new to ASP.Net MVC, which is why I could use some direction.
I am building a site for a client that is not using a Database.
I have several (~20) YouTube videos I would like to embed. The client is no longer producing these videos and this list will not be updated often. I have created a template view for the video and information. I would like to set up a model that can query a YouTube video from the data set.
My initial thought is to create a JSON File, and a model class to query the information. Is that the best way to accomplish this?

JSON seems like a great idea to me. With only about 20 records total, you're near the point where it doesn't even make sense to be data driven: just have 20 static pages with shared CSS and a Google Custom Search Engine for queries. However, I still tend to prefer relying on a data source whenever I can, and I like JSON for this.
JSON will work well here because you can use a *.js file that will be cached by most browsers, and you can execute your searches on data without even needing to refresh the page. Especially if you're using a templating system like Knockout or Ember, you can have this be entirely a client application: no server code. Such an application would be very fast from the user's perspective, especially if you use a CDN for the template engine, such that many users will already have it cached on first load.

You can use an XML document to store structured data, load it, and use XPath to query it (be mindful of XPath Injection vulnerabilities). Or use the same XML to deserialize into a data model and use LINQ to query it.
(By the way, this is by no means the only option - just the one-and-a-half that come immediately to mind.)

I would put the data in a flat text file in my preferred format (personally, JSON as well), then deserialize that into a list of objects and use LINQ queries on it. Given the small amount of data in question, I would use a flat file instead of a database even if I had the option.
You could also use a resx file as part of the project or the built-in settings as suggested in the comments. Regardless of how you do it, the amount of data is small enough that you may as well just read it into a collection in memory and then query that collection.
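As a rough illustration of the "deserialize into a list and query it with LINQ" approach, here is a minimal sketch. It assumes System.Text.Json is available (on older ASP.NET MVC projects a library such as Json.NET would fill the same role), and the Video class, its property names, and VideoRepository are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical model; the property names are assumptions about the JSON file.
public class Video
{
    public string Id { get; set; }
    public string Title { get; set; }
    public string YouTubeId { get; set; }
}

// Hypothetical repository that holds the whole (small) data set in memory.
public class VideoRepository
{
    private readonly List<Video> _videos;

    // Takes the JSON text (for example from File.ReadAllText) and deserializes it once.
    public VideoRepository(string json)
    {
        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        _videos = JsonSerializer.Deserialize<List<Video>>(json, options) ?? new List<Video>();
    }

    // Returns the video with the given id, or null if there is none.
    public Video FindById(string id)
    {
        return _videos.FirstOrDefault(v => v.Id == id);
    }

    // Case-insensitive title search over the in-memory list.
    public IEnumerable<Video> Search(string term)
    {
        return _videos.Where(v => v.Title != null &&
            v.Title.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

A controller could create one VideoRepository at startup and pass the result of FindById to the template view.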

Since it doesn't need to be updated very often, an easy approach would be to just create a hard-coded list in code from which the links are generated. If you want to be able to update the links in the future without modifying code, then XML or JSON are likely your best bets.

SSIS Transform Component: Large Scale Data Storage

I'm developing an SSIS Transform component that will need to store the contents of the incoming data stream and then output the data at a later point in time. This could be a large number of records with many fields (of any data type).
For example, this type of storage would be needed if you were developing a 'Sort' component, where you cannot output a single record until all records have been input.
My question is - what is the recommended practice for storing this temporary data? The Microsoft and Codeplex examples I've seen are somewhat trivial in that they use in-memory structures. I would like to avoid this, as this would seem to be a very bad idea when working with large data sets.
Is there a mechanism in the SSIS library to do this? [okay, it looks like there is not]
I'm considering a few options:
Store the data on disk in a stream, keeping the record offsets into this stream in memory. During the output phase, I'll use these offsets to locate the desired record.
Store the data in an ADO or OLEDB data source of the user's choosing.
Other suggestions?

No - there is no 3rd-party accessible "buffering" mechanism exposed in the API. You're responsible for it yourself, including paging to disk or whatever mechanism you choose to avoid storing all of the rows in memory.
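The first option the question considers (records on disk, offsets in memory) could look something like the sketch below. It uses no SSIS API at all: RecordSpool is a hypothetical class, it assumes each row has already been serialized to a byte array by the component, and it stores a 4-byte length prefix before each record so that records of any size can be read back in any order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: spools serialized rows to a temp file and keeps only
// one long (the file offset) per row in memory.
public sealed class RecordSpool : IDisposable
{
    private readonly string _path;
    private readonly FileStream _stream;
    private readonly List<long> _offsets = new List<long>();

    public RecordSpool()
    {
        _path = Path.GetTempFileName(); // creates an empty temp file
        _stream = new FileStream(_path, FileMode.Open, FileAccess.ReadWrite, FileShare.None);
    }

    public int Count { get { return _offsets.Count; } }

    // Appends one already-serialized record and returns its index.
    public int Append(byte[] record)
    {
        _stream.Seek(0, SeekOrigin.End);
        _offsets.Add(_stream.Position);
        byte[] length = BitConverter.GetBytes(record.Length);
        _stream.Write(length, 0, length.Length);
        _stream.Write(record, 0, record.Length);
        return _offsets.Count - 1;
    }

    // Reads a record back by index, in any order.
    public byte[] Read(int index)
    {
        _stream.Seek(_offsets[index], SeekOrigin.Begin);
        byte[] length = ReadExactly(4);
        return ReadExactly(BitConverter.ToInt32(length, 0));
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private byte[] ReadExactly(int count)
    {
        var buffer = new byte[count];
        int read = 0;
        while (read < count)
        {
            int n = _stream.Read(buffer, read, count - read);
            if (n == 0) throw new EndOfStreamException();
            read += n;
        }
        return buffer;
    }

    // Closes the stream and removes the temp file.
    public void Dispose()
    {
        _stream.Dispose();
        File.Delete(_path);
    }
}
```

For a sort-like component, the in-memory side would then hold only the sort keys plus record indexes, and the output phase would call Read in sorted order.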

Efficient way to analyze large amounts of data?

I need to analyze tens of thousands of lines of data. The data is imported from a text file. Each line of data has eight variables. Currently, I use a class to define the data structure. As I read through the text file, I store each line object in a generic list, List<T>.
I am wondering if I should switch to using a relational database (SQL) as I will need to analyze the data in each line of text, trying to relate it to definition terms which I also currently store in generic lists (List<T>).
The goal is to translate a large amount of data using definitions. I want the defined data to be filterable, searchable, etc. Using a database makes more sense the more I think about it, but I would like to confirm with more experienced developers before I make the changes, yet again (I was using structs and arraylists at first).
The only drawback I can think of, is that the data does not need to be retained after it has been translated and viewed by the user. There is no need for permanent storage of data, therefore using a database might be a little overkill.

It is not absolutely necessary to go to a database. It depends on the actual size of the data and the processing you need to do. If you are loading the data into a List with a custom class, why not use LINQ to do your querying and filtering? Something like:
// fooList is your List<Foo>; criteriaVar is the value to match
var query = from foo in fooList
            where foo.Prop == criteriaVar
            select foo;
The real question is whether the data is so large that it cannot be loaded up into memory comfortably. If that is the case, then yes, a database would be much simpler.

This is not a large amount of data. I don't see any reason to involve a database in your analysis.
There IS a query language built into C# -- LINQ. The original poster currently uses a list of objects, so there is really nothing left to do. It seems to me that a database in this situation would add far more heat than light.
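To make the "relate each line to a definition" step concrete without a database, one option is to put the definitions in a Dictionary and look them up while projecting the lines with LINQ. This is only a sketch: the DataLine class is cut down to two hypothetical fields (the question mentions eight), and Translator and the "(undefined)" placeholder are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, reduced version of the per-line class from the question.
public class DataLine
{
    public string Code { get; set; }
    public int Value { get; set; }
}

public static class Translator
{
    // Pairs each line with the text of its definition. Dictionary lookups are
    // constant time on average, so tens of thousands of lines stay cheap.
    public static List<string> Translate(
        IEnumerable<DataLine> lines,
        IDictionary<string, string> definitions)
    {
        return lines.Select(line =>
        {
            string meaning;
            if (!definitions.TryGetValue(line.Code, out meaning))
                meaning = "(undefined)"; // no definition for this code
            return meaning + ": " + line.Value;
        }).ToList();
    }
}
```

The translated list can then be filtered or searched with further Where clauses, and it simply goes out of scope once the user is done with it.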

It sounds like what you want is a database. SQLite supports in-memory databases (use ":memory:" as the filename). I suspect others may have an in-memory mode as well.

I faced the same problem at my previous company. I was looking for a solid solution for handling a lot of bar-code-generated files; each one was a text file with thousands of records. Manipulating and presenting the data was very difficult for me at first. What I ended up programming was a class that reads the file, loads the data into a data table, and can save it to a database. The database I used was SQL Server 2005. After that I was able to manage the saved data easily and present it any way I liked. The main point is to read the data from the file and save it to the database. If you do so, you will have a lot of options for manipulating and presenting the data the way you like.

If you do not mind using Access, here is what you can do:
Attach a blank Access db as a resource.
When needed, write the db out to a file.
Run a CREATE TABLE statement that handles the columns of your data.
Import the data into the new table.
Use SQL to run your calculations.
OnClose, delete that Access db.
You can use a program like Resourcer to load the db into a resx file. Then use the following code to pull the resource out of the project:
ResourceManager res = new ResourceManager( "MyProject.blank_db", this.GetType().Assembly );
byte[] b = (byte[])res.GetObject( "access.blank" );
"MyProject.blank_db" is the location and name of the resource file.
"access.blank" is the name given to the resource when it was saved.
Take the byte array and save it to the temp location with the temp filename.

If the only thing you need to do is search and replace, you may consider using sed and awk, and you can do searches using grep. Of course, this assumes a Unix platform.

From your description, I think Linux command line tools can handle your data very well. Using a database may unnecessarily complicate your work. If you are using Windows, these tools are also available in several ways. I would recommend Cygwin. The following tools may cover your task: sort, grep, cut, awk, sed, join, paste.
These Unix/Linux command line tools may look scary to a Windows person, but there are reasons why people love them. The following are my reasons for loving them:
They allow your skill to accumulate - your knowledge of a particular tool can be helpful in different future tasks.
They allow your efforts to accumulate - the command line (or scripts) you used to finish the task can be repeated as many times as needed with different data, without human interaction.
They usually outperform the equivalent tool you could write yourself. If you don't believe it, try to beat sort with your own version on terabyte files.

How can I save large amounts of data in C#?

I'm writing a program in C# that will save lots of data points and then later make a graph. What is the best way to save these points?
Can I just use a really long array or should I use a text file or excel file or something like that?
Additional information: It probably won't be more than a couple thousand. And it would be good if I could access it from a Windows Mobile app. Basically a user will be able to save times that something happens at, and then the app will use the data to find a cross correlation.

If it's millions or even thousands of records, I would probably look at using a database. You can get SQL Server 2008 Express for free, or use MySQL, or something like that.
If you go that route, LINQ to SQL makes database access a piece of cake in .NET. Entity Framework is also available, but LINQ to SQL probably has a quicker time-to-implement.

If you use a text file, an Excel file, etc., you'll still need to load the data back into memory to plot the graph.
So if you're collecting data over a long period of time, or you want to plot the graph some time in the future, write them to a plain text file. When you're ready to plot the graph, load the file up and plot the graph.
If the data collection is within a short period of time, don't bother writing to a file - it'll just add steps to the process for nothing.

A really easy way of doing this would be to serialize your object list into a BinaryWriter or XmlWriter, which automatically format your data into a readable and writable format so that, when your program needs to load the data, all you have to do is deserialize it (1 line of code).
Alternatively, if you have very many records, I suggest trying to use a database. It's quite easy to interface C# with SQL Server (there's a free version called Express Edition) or MySQL, and storing and retrieving huge amounts of data is not a pain. This would be the most efficient way to accomplish your task.
Depending on how much data you have and whether you want to accomplish something like this with 1 line of code (serialization) or interface with a separate product (the database approach), you can choose either one of the above. Of course, if you wanted to, you could just manually write the contents of your data to a text file or CSV file, as you suggested, but, from personal experience, I recommend the methods I explained above.

It probably won't be more than a couple thousand. And it would be good if I could access it from a Windows Mobile app. Basically a user will be able to save times that something happens at, and then the app will use the data to find a cross correlation.

Is there any need for interoperability with other processes? If so, time to swot up on file formats.
However, from the sound of it, you're asking on a matter of "style", with no real requirement to open the file anywhere but your own app. I'd suggest using a BinaryWriter for the task.
If debugging is an issue, a human-readable format might be preferable, but would be considerably larger than the binary equivalent.
Probably the quickest way to do it would be using binary serialization.
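As a rough sketch of the BinaryWriter suggestion, assuming the data points are simply the times at which something happened: the file starts with a count, followed by one 64-bit value per time. EventTimeStore and this file layout are hypothetical choices for illustration, not an established format.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: saves and loads a list of event times in a compact binary file.
public static class EventTimeStore
{
    // Layout: Int32 count, then one Int64 (DateTime.ToBinary) per entry.
    public static void Save(string path, IList<DateTime> times)
    {
        using (var writer = new BinaryWriter(File.Open(path, FileMode.Create)))
        {
            writer.Write(times.Count);
            foreach (DateTime time in times)
                writer.Write(time.ToBinary());
        }
    }

    public static List<DateTime> Load(string path)
    {
        using (var reader = new BinaryReader(File.Open(path, FileMode.Open)))
        {
            int count = reader.ReadInt32();
            var times = new List<DateTime>(count);
            for (int i = 0; i < count; i++)
                times.Add(DateTime.FromBinary(reader.ReadInt64()));
            return times;
        }
    }
}
```

A couple of thousand entries come to roughly 16 KB on disk with this layout, so loading everything back into memory before plotting is not a concern.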

Storing settings: XML vs. SQLite?

I am currently writing an IRC client and I've been trying to figure out a good way to store the server settings. Basically a big list of networks and their servers as most IRC clients have.
I had decided on using SQLite but then I wanted to make the list freely available online in XML format (and perhaps definitive), for other IRC apps to use. So now I may just store the settings locally in the same format.
I have very little experience with either ADO.NET or XML so I'm not sure how they would compare in a situation like this.
Is one easier to work with programmatically? Is one faster? Does it matter?

It's a vaguer question than you realize. "Settings" can encompass an awful lot of things.
There's a good .NET infrastructure for handling application settings in configuration files. These, generally, are exposed to your program as properties of a global Settings object; the classes in the System.Configuration namespace take care of reading and persisting them, and there are tools built into Visual Studio to auto-generate the code for dealing with them. One of the data types that this infrastructure supports is StringCollection, so you could use that to store a list of servers.
But for a large list of servers, this wouldn't be my first choice, for a couple of reasons. I'd expect that the elements in your list are actually tuples (e.g. host name, port, description), not simple strings, in which case you'll end up having to format and parse the data to get it into a StringCollection, and that is generally a sign that you should be doing something else. Also, application settings are read-only (under Vista, at least), and while you can give a setting user scope to make it persistable, that leads you down a path that you probably want to understand before committing to.
So, another thing I'd consider: Is your list of servers simply a list, or do you have an internal object model representing it? In the latter case, I might consider using XML serialization to store and retrieve the objects. (The only thing I'd keep in the application configuration file would be the path to the serialized object file.) I'd do this because serializing and deserializing simple objects into XML is really easy; you don't have to be concerned with designing and testing a proper serialization format because the tools do it for you.
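To show how little code the XML serialization route takes, here is a minimal sketch using XmlSerializer. The IrcNetwork and IrcServer classes are a guess at what the object model might look like (host name, port, description, as mentioned above), and NetworkListStore is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Hypothetical object model; XmlSerializer needs public types
// with public parameterless constructors.
public class IrcServer
{
    public string Host { get; set; }
    public int Port { get; set; }
    public string Description { get; set; }
}

public class IrcNetwork
{
    public string Name { get; set; }
    public List<IrcServer> Servers { get; set; } = new List<IrcServer>();
}

public static class NetworkListStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(List<IrcNetwork>));

    // Writes the whole list to a single XML file, replacing any previous contents.
    public static void Save(string path, List<IrcNetwork> networks)
    {
        using (FileStream stream = File.Create(path))
            Serializer.Serialize(stream, networks);
    }

    // Reads the whole list back into memory.
    public static List<IrcNetwork> Load(string path)
    {
        using (FileStream stream = File.OpenRead(path))
            return (List<IrcNetwork>)Serializer.Deserialize(stream);
    }
}
```

The resulting file is plain XML, so the same file could also serve as the list published online for other IRC applications.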
The primary reason I look at using a database is if my program performs a bunch of operations whose results need to be atomic and durable, or if for some reason I don't want all of my data in memory at once. If every time X happens, I want a permanent record of it, that's leading me in the direction of using a database. You don't want to use XML serialization for something like that, generally, because you can't realistically serialize just one object if you're saving all of your objects to a single physical file. (Though it's certainly not crazy to simply serialize your whole object model to save one change. In fact, that's exactly what my company's product does, and it points to another circumstance in which I wouldn't use a database: if the data's schema is changing frequently.)

I would personally use XML for settings - .NET is already built to do this and as such has many built-in facilities for storing your settings in XML configuration files.
If you want to use a custom schema (be it XML or DB) for storing settings then I would say that either XML or SQLite will work just as well since you ought to be using a decent API around the data store.

Every tool has its place.
There is plenty of hype around XML, I know. But you should keep in mind that XML is basically an exchange format -- not a storage format (unless you use a native XML database that gives you more options -- but also might add some headaches).
When your configuration is rather small (say less than 10,000 records), you might use XML and be fine. You will load the whole thing into memory and access the entries there. Done.
But when your configuration is so big that you don't want to load it completely, then you should rethink your decision and stay with SQLite, which gives you the option to dynamically load those parts of the configuration you need.
You could also provide a little tool to create an XML file from the DB content -- creation of XML from a DB is a rather simple task.

Looks like you have two separate applications here: a web server and a desktop client (because that is traditionally where these things run), each with its own storage needs.
On the server side: go with a relational data store, not XML. Basically at some point you need to keep user data separate from other user data on the server. XML is not a good store for that.
On the client: it doesn't really matter. XML will probably be easier for you to manipulate. And don't think that because you are using one technology in one setting, you have to use it in the other.
