we had this _almost_ right, in that we were trying, but we weren't masking the
error from the user response for any error we don't intend to show. this also
adds a stack trace from any internal server errors, so that we might be able
to track them down in the future (looking at you, 'context deadline
exceeded'). in addition, this adds a new `models.APIError` interface which all
of the errors in `models` now implement, and can be caught easily / added to
easily.
the front end now does no status rewriting based on api errors, now when we
get a non-nil error we can call `handleResponse(c, err)` with it and if it's a
proper error, return it to the user with the right status code, otherwise log
a stack trace and return `internal server error`. this cleans up a lot of the
front end code.
also rewrites start task ctx deadline exceeded as timeout. with iw we had
async tasks so we could start the clock later and it didn't matter, but now
with sync tasks time out sometimes just making docker calls, and we want the
task status to show up as timed out. we may want to just catch all this above
in addition to this, but this seems like the right thing to do.
remove squishing together errors. this was weird, now we return the first
error for the purposes of using the new err interface.
removed a lot of 5xx errors that really should have been 4xx errors. changed
some of the 400 errors to 409 errors, since they are from sending in
conflicting info and not a malformed request.
removed unused errors / useless errors (many were used for logging, and didn't
provide any context. now with stack traces we don't need context as much in
the logs).
replace default bolt option with sqlite3 option. the story here is that we
just need a working out of the box solution, and sqlite3 is just fine for that
(actually, likely better than bolt).
with sqlite3 supplanting bolt, we mostly have sql databases. so remove redis
and then we just have one package that has a `sql` implementation of the
`models.Datastore` and lean on sqlx to do query rewriting. this does mean
queries have to be formed a certain way and likely have to be ANSI-SQL (no
special features) but we weren't using them anyway and our base api is
basically done and we can easily extend this api as needed to only implement
certain methods in certain backends if we need to get cute.
* remove bolt & redis datastores (can still use as mqs)
* make sql queries work on all 3 (maybe?)
* remove bolt log store and use sqlite3
* shove the FnLog shit into the datastore shit for now (free pg/mysql logs...
just for demos, etc, not prod)
* fix up the docs to remove bolt references
* add sqlite3, sqlx dep
* fix up tests & mock stuff, make validator less insane
* remove put & get in datastore layer as nobody is using.
this passes tests which at least seem like they test all the different
backends. if we trust our tests then this seems to work great. (tests `make
docker-test-run-with-*` work now too)
Fixes#64
Previously calling a root registered route would result in an error
message "Not Found" suggesting the route hadn't been registed, yet when
listing the routes, `fn routes list myapp` you could see the `/` route.
You can now successfully call a root registered route with `fn call
myapp /`
this patch gets rid of max concurrency for functions altogether, as discussed,
since it will be challenging to support across functions nodes. as a result of
doing so, the previous version of functions would fall over when offered 1000
functions, so there was some work needed in order to push this through.
further work is necessary as docker basically falls over when trying to start
enough containers at the same time, and with this patch essentially every
function can scale infinitely. it seems like we could add some kind of
adaptive restrictions based on task run length and configured wait time so
that fast running functions will line up to run in a hot container instead of
them all creating new hot containers.
this patch takes a first cut at whacking out some of the insanity that was the
previous concurrency model, which was problematic in that it limited
concurrency significantly across all functions since every task went through
the same unbuffered channel, which could create blocking issues for all
functions if the channel is not picked off fast enough (it's not apparent that
this was impossible in the previous implementation). in any event, each
request has a goroutine already, there's no reason not to use it. not too hard
to wrap a map in a lock, not sure what the benefits were (added insanity?) in effect
this is marginally easier to understand and less insane (marginally). after
getting rid of max c this adds a blocking mechanism for the first invocation
of any function so that all other hot functions will wait on the first one to
finish to avoid a herd issue (was making docker die...) -- this could be
slightly improved, but works in a pinch. reduced some memory usage by having
redundant maps of htfnsvr's and task.Requests (by a factor of 2!). cleaned up
some of the protocol stuff, need to clean this up further. anyway, it's a
first cut. have another patch that rewrites all of it but was getting into
rabbit hole territory, would be happy to oblige if anybody else has problems
understanding this rat's nest of channels. there is a good bit of work left to
make this prod ready (regardless of removing max c).
a warning that this will break the db schemas, didn't put the effort in to add
migration stuff since this isn't deployed anywhere in prod...
TODO need to clean out the htfnmgr bucket with LRU
TODO need to clean up runner interface
TODO need to unify the task running paths across protocols
TODO need to move the ram checking stuff into worker for noted reasons
TODO need better elasticity of hot f(x) containers
* API endpoint extensions working.
extensions example.
extensions example.
* Added server.NewEnv and some docs for the API extensions example.
extensions example.
extensions example.
* Uncommented special handler stuff.
* First example of middleware.
easier to use.
* Added a special Middleware context to make middleware easier to use.
* Fix tests.
* Cleanup based on PR comments.
* API endpoint extensions working.
extensions example.
* Added server.NewEnv and some docs for the API extensions example.
extensions example.
example main.go.
* Uncommented special handler stuff.
* Added section in docs for extending API linking to example main.go.
* Commented out special_handler test
* Changed to NewFromEnv
* Add global lru for routes with keys being the appname + path
* minor comment fixes
* remove duplicate entires from THIRD_PARTY
* Make sure that we lock and unlock on get, refresh and delete on the cache
* functions: modify datastore to accomodate hot containers support
* functions: protocol between functions and hot containers
* functions: add hot containers clockwork
* fn: add hot containers support
* ctx middleware should always be the first added to router
* plugable enqueue func, changed server.New signature
* fix tests
* remove ctx/ctx.Done from server
* functions: add bounded concurrency
* functions: plug runners to sync and async interfaces
* functions: update documentation about the new env var
* functions: fix test flakiness
* functions: the runner is self-regulated, no need to set a number of runners
* functions: push the execution to the background on incoming requests
* functions: ensure async tasks are always on
* functions: add prioritization to tasks consumption
Ensure that Sync tasks are consumed before Async tasks. Also, fixes
termination races problems for free.
* functions: remove stale comments
* functions: improve mem availability calculation
* functions: parallel run for async tasks
* functions: check for memory availability before pulling async task
* functions: comment about rnr.hasAvailableMemory and sync.Cond
* functions: implement memory check for async runners using Cond vars
* functions: code grooming
- remove unnecessary goroutines
- fix stale docs
- reorganize import group
* Revert "functions: implement memory check for async runners using Cond vars"
This reverts commit 922e64032201a177c03ce6a46240925e3d35430d.
* Revert "functions: comment about rnr.hasAvailableMemory and sync.Cond"
This reverts commit 49ad7d52d341f12da9603b1a1df9d145871f0e0a.
* functions: set a minimum memory availability for sync
* functions: simplify the implementation by removing the priority queue
* functions: code grooming
- code deduplication
- review waitgroups Waits
Currently, async workers are started before HTTP interface is available
to get their requests. It fixes by ensuring that async workers are
started after HTTP interface is up.
Essentially we are getting rid of an error message during bootstrap:
ERRO[0000] Could not fetch task error=Get http://127.0.0.1:8080/tasks: dial tcp 127.0.0.1:8080: getsockopt: connection refused