data (#23) - Performance and memory usage hint (#282) - Message List
Hi all,
I thought I would share a helpful hint after trying to figure out why the shortest_path performance on some of my tables was much slower than with other tables, and was consuming a huge amount of RAM (and swap) in the process.
Let's say you have a table with 50,000 edges. Run two quick queries: 'SELECT source FROM <yourtable> ORDER BY source DESC LIMIT 50' and 'SELECT target FROM <yourtable> ORDER BY target DESC LIMIT 50'. Make sure that the highest number in the source/target fields is more-or-less in line with the number of edges in the table. (It will probably be somewhat lower than the number of edges, actually.)
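If you prefer, a single query can make the same check (this is just a sketch; substitute your actual table name):

```sql
-- Compare the highest vertex ID against the edge count.
-- If max_vertex_id is far larger than edge_count, some
-- source/target values are out of line.
SELECT GREATEST(MAX(source), MAX(target)) AS max_vertex_id,
       COUNT(*)                           AS edge_count
FROM <yourtable>;
```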
I had entered a dummy value like 9999999 in one of those fields for a couple of edges as a quick fix during data cleanup. The queries ran correctly, but they were consuming a huge amount of memory. I spent a bunch of time trying to tune PostgreSQL's memory parameters and then noticed an old post on this discussion forum that alluded to a similar problem. I changed the dummy values to more reasonable ones and my shortest_path queries now run in about 1/4 of the time (from about 2 seconds to 500ms).
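If you suspect the same issue, something like this should flag the offending rows (a sketch only: it assumes a gid primary-key column and uses the edge count as a rough upper bound for sane vertex IDs):

```sql
-- List edges whose vertex IDs look suspiciously high,
-- i.e. well above what the edge count would suggest.
SELECT gid, source, target
FROM <yourtable>
WHERE source > (SELECT COUNT(*) FROM <yourtable>)
   OR target > (SELECT COUNT(*) FROM <yourtable>);
```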
Hope this helps someone!
Jordan (ridethecity.com)
-
Message #1001
Thanks a lot for your hint and for sharing this with everyone! It's a very helpful one.
You're right: if you have very high vertex numbers it may decrease performance a lot, especially if you have lots of data. So renumbering the source/target IDs and keeping those numbers low is a good idea. It's usually no issue if you run assign_vertex_id to create the network topology.
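For reference, rebuilding the topology with assign_vertex_id looks roughly like this (a sketch assuming the pgRouting 1.x signature and the default the_geom geometry column and gid key; adjust names and tolerance to your schema):

```sql
-- Renumber source/target with compact, sequential vertex IDs.
-- The tolerance (here 0.0001, in the units of your SRID) snaps
-- nearby edge endpoints to the same vertex.
SELECT assign_vertex_id('<yourtable>', 0.0001, 'the_geom', 'gid');
```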
daniel, 08/04/09 09:22:44 (16 months ago)