location: add all.py, using takeout/gpslogger/ip #237
Conversation
I think that's everything I could think of; let me know if I should change anything.
Also, not using it otherwise feels like it's polluting all the other sources, and it would be hard to cache/keep efficient.
Ah, I guess this section assumed that the dates were sorted, but that's not really possible here. It works when you just have the one google source, but since this combines a bunch of (possibly unknown and unsorted) sources, it's dropping a bunch of data there... Guess the check for 'local time goes backwards' should just be removed -- the way I'm 'solving' this is to sort the locations first.
```diff
  continue
  tz = pytz.timezone(zone)
  # TODO this is probably a bit expensive... test & benchmark
- ldt = l.dt.astimezone(tz)
+ ldt = dt.astimezone(tz)
  ndate = ldt.date()
```
I think this section comparing to the previous date received from locations may need to be removed, since we can't control the order of `my.location`. So this will be slower, but that's the trade-off for allowing any sort of location data as input. It's not trivial to cache the `my.location.all` function either, since it depends on any number of inputs, so creating an accurate cachew `depends_on` would be difficult.
Hmm yeah, in principle we could either demand that the upstream providers emit it sorted (and for many it'll already be, like google takeout?), or alternatively sort before merging them in `all` (via a helper `merge` function); then the whole thing would be much more iterative. But I guess we can think about it later.
Or we could even do some magic, e.g. a function which takes multiple generator factories (functions `() -> Iterator`), assumes that they are emitting sorted values, and tries merging with https://docs.python.org/3/library/heapq.html#heapq.merge. If something out of order is detected, it starts from scratch with the offending factory wrapped in `sorted` (hence merging factories, not plain `Iterator`s). I guess realistically it would very quickly detect which iterables are already sorted and which aren't; usually the order is either increasing, decreasing, or just completely random.
Just leaving it here, but it might potentially be useful for other data sources too -- e.g. it's often kinda nice to get things ordered by timestamp.
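That "magic" idea could be sketched like this (all names here are hypothetical, with plain integers standing in for timestamped values): each factory's stream is wrapped so it raises the moment a value goes backwards, and on failure the whole merge restarts with the offending factory wrapped in an eager sort. Note the result is materialized into a list, since a restart-from-scratch isn't possible once values have been yielded to the caller.

```python
import heapq
from typing import Callable, Iterator, List, TypeVar

T = TypeVar('T')
Factory = Callable[[], Iterator[T]]


class _Unsorted(Exception):
    def __init__(self, factory: Factory) -> None:
        self.factory = factory


def _checked(factory: Factory, key) -> Factory:
    # wrap a factory so its stream raises the moment it goes backwards
    def gen():
        last = None
        for item in factory():
            k = key(item)
            if last is not None and k < last:
                raise _Unsorted(factory)
            last = k
            yield item
    return gen


def merge_factories(factories: List[Factory], key) -> list:
    # optimistic pass: assume every factory is sorted; on failure, wrap the
    # offending factory in an eager sort and restart from scratch
    factories = list(factories)
    while True:
        try:
            streams = [_checked(f, key)() for f in factories]
            return list(heapq.merge(*streams, key=key))
        except _Unsorted as e:
            i = factories.index(e.factory)
            offender = factories[i]
            factories[i] = lambda offender=offender: iter(sorted(offender(), key=key))
```

Each failure permanently sorts exactly one factory, so the loop terminates after at most one retry per unsorted source; already-sorted sources never pay the sorting cost.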
Yeah, using a heap merge would make sense in situations like these, but I still think it might be prone to errors, because a user might just overwrite `all.py` with something else (like I've done here) and then the magic doesn't work anymore. As usual, the magic might make it nice for us, but it's not super extensible -- I feel like changing an `all.py` file shouldn't break a separate module.
For now I think I'll go with the iterative approach of sorting the `_locations` in `_iter_tzs`, since it's cached when it's used repeatedly.
Could possibly add a `sorted` or `geo` location module in the future which sorts and uses geopy to return more info about the locations.
Another idea could be for `all.py` modules to specify their sources as a `SOURCES = [ ]` variable at the top level; then we could actually do some sort of magic and heap-merge the iterators by querying that, instead of doing it in `all.py` itself. Otherwise, `my.location.all.locations()` is just a function with no way to get at the underlying iterators. If the `SOURCES` var doesn't exist, fall back to sorting the giant list of locations iteratively.
`locations` in `my.location.all` would then just be:

```python
for provider in SOURCES:
    yield from provider()
```
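A runnable sketch of that `SOURCES` introspection idea (the `query_sorted` helper and the `SimpleNamespace` stand-ins for `all.py`-style modules are hypothetical, with integers in place of locations):

```python
import heapq
from types import SimpleNamespace


def _src_a():
    yield from [3, 1]


def _src_b():
    yield from [2]


# hypothetical all.py-like modules: one exposing SOURCES, one not
with_sources = SimpleNamespace(SOURCES=[_src_a, _src_b])
without_sources = SimpleNamespace(locations=lambda: iter([5, 4]))


def query_sorted(module, key=lambda x: x):
    sources = getattr(module, 'SOURCES', None)
    if sources is not None:
        # sort each provider individually, then heap-merge them lazily
        yield from heapq.merge(*(sorted(s(), key=key) for s in sources), key=key)
    else:
        # no introspection possible: fall back to one giant sort
        yield from sorted(module.locations(), key=key)
```

The fallback branch is what keeps a user-overridden `all.py` working: modules that don't declare `SOURCES` just cost one big sort instead of breaking.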
Looks good, thanks! A few discussion points:
I believe that's all the changes -- would probably be best to squash this whole thing when merging.
nice! thanks
Should resolve #154
Zulip convo: https://memex.zulipchat.com/#narrow/stream/279601-hpi/topic/locations
Still need to update the docs, but the code has been updated -- take a look at this whenever you're free.
CI may fail; I've made quite a few updates.
Still TODO: `my.config.time.tz.via_location.require_accuracy`