So I'm trying to create a list of the 15 closest people in an array of varying sizes. The array will almost always be ~100 objects in size, but for the sake of testing, I'm trying to make it work with 10,000 (there may be need for the project to be scaled up to these numbers further down the line).
Currently, the method in place is to loop through the array of people, and calculate their proximity to the user based on both the person in question's and the user's latitude and longitude (the former of which is stored in the array). This is done using the haversine formulae and works well (though it does take ~500 milliseconds).
The problem however is that when run on a mobile device (a Samsung Galaxy S5 for the sake of this example), performance really suffers. The time taken for the S5 to sort through 10,000 records in order of how close they are to a pre-determined latitude and longitude is a staggering 1,500-1,600 milliseconds, an unacceptable delay for an app that will be doing many things either side of this process.
So my question is, am I missing some fundamentally more efficient means of sorting this list? Is there an alternative formulae available that is more efficient? Could I simply calculate the combined difference in Latitude and Longitude in .000001s and sort based on that?
Notes:
The user's location is variable, so proximities cannot be stored
I am aware that I'm asking a mobile CPU to perform 100,000,000 calculations in a short space of time and so this may be unavoidable
The method of sorting is the native JavaScript sort method, below is a simplified version of what I am doing to test these timings:
patientArray.sort(function(a, b)
{
return GetDistanceToPoint(a["Lat"], a["Lng"]) - GetDistanceToPoint(b["Lat"], b["Lng"]);
});
// Function to get the User's distance to a point
function GetDistanceToPoint(Latitude, Longitude)
{
// Check if the User's current Latitude and Longitude are available
if(currentLat && currentLng)
{
// Convert degrees to a radius
function degreeToRadius(degree)
{
return degree * (Math.PI/180)
}
// Variable to store radius of the Earth in Km
var earthRadius = 6371;
// Calculate the distance between the two points
var dLat = degreeToRadius(Latitude-currentLat);
var dLon = degreeToRadius(Longitude-currentLng);
var a = Math.sin(dLat/2) * Math.sin(dLat/2) + Math.cos(degreeToRadius(currentLat)) * Math.cos(degreeToRadius(Latitude)) * Math.sin(dLon/2) * Math.sin(dLon/2);
var c = 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1-a));
var d = earthRadius * c;
return d;
}
return "-1";
}
It all has to be tested but here are some ideas that I would try.
For heavy use of trigonometric functions you can use lookup tables. This is always a good idea. For example precompute 360 (or more) values of sin() and for every sin(radians) in your code use sinTable[degrees].
(I say 360 values as an example because with that your index is an angle in degrees but any value will do and it all depends on what precision you need - it can have thousands of values if needed.)
Avoid unnecessary calculations. May seem obvious but people often write something like
x/(2*Math.PI)
instead ofx*A
whereA
(a better name of course) is computed once as1/(2*Math.PI)
.Memoize every value that you can, if it makes sense.
If your data have some specific qualities, like for example never spanning half of the planet, then you can try to cheat a little bit and use coordinates on a flat plane - then you only have to compute square roots (which could also be precomputed to use lookup tables).
Those are the first things that crossed my mind. Hope it helps.
UPDATE:
You made an edit so I know a little bit more now. Here are my tips:
Don't convert degrees to radians. Keep degrees and use them as indexes in lookup tables of precomputed values of trigonometric functions. If you need more precision then multiply the degrees by 10 or something and use a scale from 0 to 3600 instead of 0 to 360. Find a good size/precision compromise that works for you.
You can eliminate all sin() and cos() calls that way and if you're lucky you can eliminate atan2(). I wouldn't worry so much about sqrt() but you can eliminate it too if you know what the values are typically going to be. If values of functions like sqrt() or atan2() are not known up fron then you can fall back to real functions for values that are out of range of your lookup tables.
Avoid to many function calls. Instead of an anonymous function that you pass to patientArray.sort(), that calls GetDistanceToPoint(), that calls degreeToRadius() - you need only one function that can be passed directly as an argument to .sort() and that function doesn't need to return
d
- it can return justc
(in your example).You don't need to multiply everything by earthRadius if you only use that value for sorting.
Another quick ideas: using typed arrays (for lookup tables), and asm.js and SIMD.js for additional optimization if possible.
Those are the first things that come to mind. I'd love to hear how much faster your code can get. Good luck.
UPDATE 2:
Another idea - instead of (or in addition to) optimizing the GetDistanceToPoint() you can also make sure that it isn't called more than once for every object.
Instead of:
You can try doing something like:
or maybe for loop will be faster:
to store the value for the entire patientArray array.
And then in the sort function:
It may hopefully save you a lot of calls to GetDistanceToPoint.