prior to this patch we were allowing 256MB for every function run, just
because that was the default for the docker driver and we were not using the
memory field on any given route configuration. this fixes that: docker
containers now get the correct memory limit passed in from the route. the
default is still 128MB.
there is also an env var now, `MEMORY_MB`, that is set on each function call;
see the linked issue below for rationale.
closes #186
ran the given function code from #186, and now i only see allocations up to
32MB before the function is killed. yay.
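for reference, a minimal sketch of the idea, assuming the fsouza/go-dockerclient API; the helper and field names here are illustrative, not the exact driver code:

```go
// package memlimit: illustrative sketch, not the actual docker driver code.
package memlimit

import (
	"fmt"

	docker "github.com/fsouza/go-dockerclient"
)

// createOpts builds container options with the route's memory limit (MB) and
// exposes it to the function as MEMORY_MB.
func createOpts(image string, memoryMB uint64, env []string) docker.CreateContainerOptions {
	if memoryMB == 0 {
		memoryMB = 128 // default stays 128MB
	}
	env = append(env, fmt.Sprintf("MEMORY_MB=%d", memoryMB))
	return docker.CreateContainerOptions{
		Config: &docker.Config{
			Image:  image,
			Env:    env,
			Memory: int64(memoryMB * 1024 * 1024), // bytes, enforced by docker
		},
	}
}
```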
notes:
there is no max for memory. for open source fn i'm not sure we really want to
cap it; in the services repo we should probably add a cap before prod.
since we don't know any given fn server's ram, we can't verify that the
setting on any given route is something that can even be run.
remove envconfig & bytefmt
this updates the glide.yaml file to remove the unused deps, but a fresh
install is broken atm so i couldn't remove them from vendor/; going to fix that
separately (next update we just won't pull these in). also changed the skip dir to
be the cli dir now that its name has changed (related to the brokenness).
fix how ram slots were being allocated: integer division is significantly
slower than subtraction, so free ram is now tracked by subtraction (see the sketch below).
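a minimal sketch of subtraction-based accounting, with illustrative names (not the repo's actual types):

```go
// package ramtrack: illustrative sketch of tracking free ram by subtraction.
package ramtrack

import "sync"

type ramTracker struct {
	mu   sync.Mutex
	cond *sync.Cond
	free uint64 // bytes currently available
}

func newRAMTracker(total uint64) *ramTracker {
	t := &ramTracker{free: total}
	t.cond = sync.NewCond(&t.mu)
	return t
}

// reserve blocks until `need` bytes are available, then subtracts them.
func (t *ramTracker) reserve(need uint64) {
	t.mu.Lock()
	for t.free < need {
		t.cond.Wait()
	}
	t.free -= need // plain subtraction, no division into fixed slots
	t.mu.Unlock()
}

// release returns the bytes and wakes any waiters.
func (t *ramTracker) release(need uint64) {
	t.mu.Lock()
	t.free += need
	t.cond.Broadcast()
	t.mu.Unlock()
}
```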
this changes the behavior of hot containers:
1) we are no longer populating a hot container with all of the env vars from the
first request that started that hot container; the container is now only
populated with vars defined on the app or route.
2) when env vars are changed on the route or app, we will now start up a new
hot container that contains those changes.
3) fixes a bug where the image and path name could create a collision, e.g.
`/yo/foo` & `oze/yo:latest` collides with `/yo/fo` & `ooze/yo:latest` (with all
other fields held constant). since we're namespacing with app name, in theory
it could only happen to the same user (though we were relying on a comma
delimiter there, which is not great). now we use NULL bytes, which should be
hard to get in through a json api ;) i also added a sha1 over the key to keep
the size of the (soon to be very large) map down; i don't expect collisions
but, well, it's a hash function. see the key sketch after this list.
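a minimal sketch of the keying idea (NULL-byte delimiter plus sha1); the fields included here are illustrative, not necessarily the exact set the runner hashes:

```go
// package hotkey: illustrative sketch of the hot container map key.
package hotkey

import (
	"crypto/sha1"
	"encoding/hex"
	"sort"
	"strings"
)

// hotKey builds the map key for a hot container. NULL bytes are hard to sneak
// in through a json api, so `/yo/fo` + `ooze/yo:latest` and `/yo/foo` +
// `oze/yo:latest` no longer collide the way they could with a comma delimiter.
func hotKey(appName, path, image, format string, env map[string]string) string {
	var b strings.Builder
	for _, part := range []string{appName, path, image, format} {
		b.WriteString(part)
		b.WriteByte(0) // NULL delimiter instead of ','
	}
	// env vars are part of the key, so changing them on the app or route
	// results in a new hot container
	keys := make([]string, 0, len(env))
	for k := range env {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		b.WriteString(k)
		b.WriteByte(0)
		b.WriteString(env[k])
		b.WriteByte(0)
	}
	// sha1 keeps the (soon to be very large) map's keys at a fixed size
	sum := sha1.Sum([]byte(b.String()))
	return hex.EncodeToString(sum[:])
}
```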
a small note that we could add a few things to the hot container env that will
not change on a per-request basis, such as `app_name`, `format` and `route`, but
it's a bit pedantic. ultimately, it's confusing imo that we have a different set
of vars in the env and in the request itself for hot, which is unavoidable unless
we choose to omit setting env vars entirely, but setting them seems to be what
the people want (lmk, people, if otherwise).
Each time the MQ became unreachable, HTTP GET /tasks returned HTTP 500,
and the code was not handling this case; it only expected networking errors.
It then tried to unmarshal the empty response body, which caused another sort of error.
This patch raises an error based on the HTTP response code, explicitly checking
whether the response code is something unexpected (not HTTP 200 OK).
The response status code for /tasks was also changed from 202 Accepted to 200 OK, per the swagger doc.
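A minimal sketch of the check, assuming a plain net/http client and a hypothetical tasksResponse type (not the runner's exact code):

```go
// package tasksclient: illustrative sketch of failing fast on non-200.
package tasksclient

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// tasksResponse is a hypothetical stand-in for the real /tasks payload.
type tasksResponse struct {
	Tasks []struct {
		ID string `json:"id"`
	} `json:"tasks"`
}

// getTasks fetches queued tasks, erroring out on any non-200 status
// instead of trying to decode an empty body.
func getTasks(url string) (*tasksResponse, error) {
	resp, err := http.Get(url)
	if err != nil {
		return nil, err // networking errors were already handled before this patch
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET /tasks: unexpected status %d", resp.StatusCode)
	}

	var out tasksResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	return &out, nil
}
```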
we had the container inspect call here for 3 reasons:
1) get exit code
2) see if container is still running (debugging madness)
3) see if docker thinks it was an OOM
1) is something wait returns, but due to 2) and 3) we delayed getting it until
the inspect call
2) was really just for debugging, since we had 3)
3) seems unnecessary. to me, an OOM is an OOM is an OOM, so why make a whole
docker inspect call just to find out? (we could move this down, since it's a
sad path, and make the call only when necessary, but are we really getting any
value from this distinction anyway? i've never run into it, myself)
inspect was actually causing tasks to time out, since the call to inspect
could put us over our task timeout even though the container ran to
completion. we could have fixed this by checking the context earlier, but we
don't really need inspect either, and dropping it reduces the docker calls we
make, which will make more unicorn puppers. now tasks should have more 'true'
timeouts.
tried to boy scout, but the tracing patch cleans this block up further too.
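for reference, a minimal sketch of leaning on wait for the exit code, assuming the fsouza/go-dockerclient API; names are illustrative:

```go
// package waitless: illustrative sketch of getting the exit code from wait
// instead of a follow-up inspect call.
package waitless

import (
	docker "github.com/fsouza/go-dockerclient"
)

// runAndWait starts the container and relies on WaitContainer for the exit
// code; no InspectContainer round trip afterwards.
func runAndWait(client *docker.Client, id string) (int, error) {
	if err := client.StartContainer(id, nil); err != nil {
		return -1, err
	}
	// WaitContainer blocks until the container exits and returns its exit
	// code, which is all we actually needed from inspect.
	exitCode, err := client.WaitContainer(id)
	if err != nil {
		return -1, err
	}
	return exitCode, nil
}
```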
we finally graduated high school and can make our own ramen
we no longer need this since fn appears to have no concept of canceling tasks
through an api we need to watch, and the context is plumbed if the request is
canceled. since tasks are short, we may never need to do cancellation of
running tasks like we had with iron worker. this was an added docker call
that's unnecessary since we are doing force removal of the container at the
end anyway.
we had this _almost_ right, in that we were trying, but we weren't masking the
error in the user response for any error we don't intend to show. this also
adds a stack trace for any internal server error, so that we might be able
to track them down in the future (looking at you, 'context deadline
exceeded'). in addition, this adds a new `models.APIError` interface, which all
of the errors in `models` now implement, and which can be caught easily / added
to easily.
the front end no longer does any status rewriting based on api errors. now when
we get a non-nil error we can call `handleResponse(c, err)` with it: if it's a
proper api error, we return it to the user with the right status code; otherwise
we log a stack trace and return `internal server error`. this cleans up a lot of
the front end code. see the sketch below.
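a minimal sketch of the shape of this, assuming a gin handler context; the interface methods and logging here are illustrative, not the exact code in `models`:

```go
// package apierrors: illustrative sketch of the APIError / handleResponse idea.
package apierrors

import (
	"net/http"
	"runtime/debug"

	"github.com/gin-gonic/gin"
	"github.com/sirupsen/logrus"
)

// APIError is the shape implemented by errors in `models` that are safe to
// show to users.
type APIError interface {
	error
	Code() int // HTTP status code to return
}

// handleResponse returns known api errors with their status code, and masks
// everything else as a 500 while logging a stack trace for later debugging.
func handleResponse(c *gin.Context, err error) {
	if apiErr, ok := err.(APIError); ok {
		c.JSON(apiErr.Code(), gin.H{"error": gin.H{"message": apiErr.Error()}})
		return
	}
	logrus.WithError(err).Errorf("internal server error\n%s", debug.Stack())
	c.JSON(http.StatusInternalServerError, gin.H{"error": gin.H{"message": "internal server error"}})
}
```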
also rewrites a start-task ctx deadline exceeded as a timeout. with iw we had
async tasks so we could start the clock later and it didn't matter, but now
with sync, tasks sometimes time out just making docker calls, and we want the
task status to show up as timed out. we may want to catch all of this further
up in addition, but this seems like the right thing to do.
removed the squishing together of errors. it was weird; now we return the first
error so it works with the new error interface.
removed a lot of 5xx errors that really should have been 4xx errors. changed
some of the 400 errors to 409 errors, since they come from sending conflicting
info rather than a malformed request.
removed unused / useless errors (many were used only for logging and didn't
provide any context; now that we have stack traces we don't need as much
context in the logs).
this works by having the functions server kick back an FXLB-WAIT header on
every response with the wait time for that function to start. the lb then
keeps, per node+function, an ewma of the last 10 requests' wait times (to
reduce jitter; sketch below). now that we don't have max concurrency it's
actually pretty challenging to get the wait time to tick up. i expect in the
near future we will be throttling functions on a given node in order to induce
this, but that is for another day as that code needs a lot of reworking. i
tested this by introducing some arbitrary throttling (not checked in) and load
spreads over nodes correctly (see images). we will also need to play with the
intervals we want to use: if you have a func with a 50ms run time then
basically 10 of those will rev up another node (this was before removing
max_c, with max_c=1). in any event, this wires in the basic plumbing.
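a minimal sketch of the per node+function bookkeeping, treating "last 10 requests" as a smoothing window of N=10 (alpha = 2/(N+1)); the real lb code may differ:

```go
// package lbstats: illustrative sketch of the per node+function wait ewma.
package lbstats

import "sync"

const alpha = 2.0 / (10 + 1) // ~10-sample smoothing window

type waitTracker struct {
	mu   sync.Mutex
	ewma map[string]float64 // key: node + "\x00" + function path
}

func newWaitTracker() *waitTracker {
	return &waitTracker{ewma: make(map[string]float64)}
}

// observe folds the FXLB-WAIT value (seconds) from a response into the ewma
// for that node+function and returns the updated average.
func (w *waitTracker) observe(node, fn string, waitSec float64) float64 {
	key := node + "\x00" + fn
	w.mu.Lock()
	defer w.mu.Unlock()
	prev, ok := w.ewma[key]
	if !ok {
		w.ewma[key] = waitSec // first sample seeds the average
		return waitSec
	}
	next := alpha*waitSec + (1-alpha)*prev
	w.ewma[key] = next
	return next
}
```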
* make docs great again. renamed lb dir to fnlb
* added wait time to dashboard
* wires in a ready channel to await the first pull for hot images so it counts
toward the wait time (should be otherwise useful)
future:
TODO rework lb code api to be pluggable + wire in data store
TODO toss out first data point containing pull to not jump onto another node
immediately (maybe this is actually a good thing?)
this patch gets rid of max concurrency for functions altogether, as discussed,
since it will be challenging to support across functions nodes. without it,
the previous version of functions would fall over when offered 1000 functions,
so some work was needed to push this through. further work is necessary, as
docker basically falls over when trying to start enough containers at the same
time, and with this patch essentially every function can scale infinitely. it
seems like we could add some kind of adaptive restriction based on task run
length and configured wait time so that fast-running functions line up to run
in a hot container instead of each creating a new one.
this patch takes a first cut at whacking out some of the insanity that was the
previous concurrency model. it was problematic in that it significantly limited
concurrency across all functions, since every task went through the same
unbuffered channel, which could create blocking issues for all functions if the
channel wasn't picked off fast enough (it's not apparent that the previous
implementation avoided this). in any event, each request already has a
goroutine, so there's no reason not to use it; it's not hard to wrap a map in a
lock, and i'm not sure what the benefits of the channel were (added insanity?).
in effect this is marginally easier to understand and less insane (marginally).
after getting rid of max c, this adds a blocking mechanism for the first
invocation of any function so that all other hot functions wait on the first
one to finish, to avoid a herd issue (it was making docker die...) -- see the
sketch after this paragraph; this could be slightly improved, but works in a
pinch. also reduced some memory usage by removing redundant maps of htfnsvr's
and task.Requests (by a factor of 2!). cleaned up some of the protocol stuff,
which needs further cleanup. anyway, it's a first cut. i have another patch
that rewrites all of it but it was getting into rabbit hole territory; would be
happy to oblige if anybody else has problems understanding this rat's nest of
channels. there is a good bit of work left to make this prod ready (regardless
of removing max c).
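a minimal sketch of the first-invocation gate described above, with illustrative names (not the repo's actual structures):

```go
// package herd: illustrative sketch of blocking on the first cold start per
// hot function so later callers don't stampede docker.
package herd

import "sync"

type launcher struct {
	mu    sync.Mutex
	first map[string]chan struct{} // closed once the first start finishes
}

func newLauncher() *launcher {
	return &launcher{first: make(map[string]chan struct{})}
}

// launch ensures only the first caller for a given hot function key runs the
// expensive cold start (pull + container boot); everyone else blocks until
// that first attempt finishes.
func (l *launcher) launch(key string, coldStart func() error) error {
	l.mu.Lock()
	done, ok := l.first[key]
	if ok {
		l.mu.Unlock()
		<-done // someone else is (or was) doing the first start; wait for it
		return nil
	}
	done = make(chan struct{})
	l.first[key] = done
	l.mu.Unlock()

	err := coldStart() // only the first caller pays the cold-start cost
	close(done)        // release the herd
	return err
}
```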
a warning that this will break the db schemas; i didn't put the effort in to add
migration stuff since this isn't deployed anywhere in prod...
TODO need to clean out the htfnmgr bucket with LRU
TODO need to clean up runner interface
TODO need to unify the task running paths across protocols
TODO need to move the ram checking stuff into worker for noted reasons
TODO need better elasticity of hot f(x) containers