957 Commits

Author SHA1 Message Date
Reed Allman
2ebc9c7480 hybrid mergy (#581)
* so it begins

* add clarification to /dequeue, change response to list to future proof

* Specify that runner endpoints are also under /v1

* Add a flag to choose operation mode (node type).

This is specified using the `FN_NODE_TYPE` environment variable. The
default is the existing behaviour, where the server supports all
operations (full API plus asynchronous and synchronous runners).

The additional modes are:
* API - the full API is available, but no functions are executed by the
  node. Async calls are placed into a message queue, and synchronous
  calls are not supported (invoking them results in an API error).
* Runner - only the invocation/route API is present. Asynchronous and
  synchronous invocation requests are supported, but asynchronous
  requests are placed onto the message queue, so might be handled by
  another runner.

* Add agent type and checks on Submit

* Sketch of a factored out data access abstraction for api/runner agents

* Fix tests, adding node/agent types to constructors

* Add tests for full, API, and runner server modes.

* Added atomic UpdateCall to datastore

* adds in server side endpoints

* Made ServerNodeType public because tests use it

* fix test build

* add hybrid runner client

pretty simple go api client that covers surface area needed for hybrid,
returning structs from models that the agent can use directly. not exactly
sure where to put this, so put it in `/clients/hybrid` but maybe we should
make `/api/runner/client` or something and shove it in there. want to get
integration tests set up and use the real endpoints next and then wrap this up
in the DataAccessLayer stuff.

* gracefully handles errors from fn
* handles backoff & retry on 500s
* will add to existing spans for debuggo action

* minor fixes

* meh
2017-12-11 10:43:19 -08:00
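Note on 2ebc9c7480: the three operation modes described in that message can be sketched roughly as below. This is only an illustration of the rules as stated in the commit text, not the code from the commit. The type name `NodeType`, the helper names, and the accepted strings ("full", "api", "runner") are all assumptions; only the `FN_NODE_TYPE` variable name comes from the message.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeType is a hypothetical enum for the three operation modes.
type NodeType int

const (
	NodeFull   NodeType = iota // full API plus sync and async runners (the default)
	NodeAPI                    // full API, but no functions are executed on this node
	NodeRunner                 // only the invocation/route API is present
)

// ParseNodeType maps an FN_NODE_TYPE value to a NodeType.
// The accepted strings are assumptions; an empty value selects the default.
func ParseNodeType(v string) (NodeType, error) {
	switch strings.ToLower(strings.TrimSpace(v)) {
	case "", "full":
		return NodeFull, nil
	case "api":
		return NodeAPI, nil
	case "runner":
		return NodeRunner, nil
	}
	return NodeFull, fmt.Errorf("unknown node type %q", v)
}

// ServesFullAPI reports whether the management API is exposed.
func (t NodeType) ServesFullAPI() bool { return t != NodeRunner }

// AllowsSyncInvoke reports whether synchronous calls are accepted.
// Per the message, an API node answers them with an API error.
func (t NodeType) AllowsSyncInvoke() bool { return t != NodeAPI }

// ExecutesFunctions reports whether this node runs functions itself.
func (t NodeType) ExecutesFunctions() bool { return t != NodeAPI }
```

A server could call `ParseNodeType(os.Getenv("FN_NODE_TYPE"))` at startup and register only the routes the resulting mode allows.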
CI
eb39c22bf5 fnserver: 0.3.228 release [skip ci] 2017-12-07 14:13:52 +00:00
CI
834e61cd9f fnserver: 0.3.227 release [skip ci] 2017-12-07 00:27:41 +00:00
Tolga Ceylan
9481f811b7 fn: fail count should include timeouts (#577)
* fn: fail count should include timeouts
2017-12-06 16:11:59 -08:00
CI
af08abd532 fnserver: 0.3.226 release [skip ci] 2017-12-07 00:07:48 +00:00
CI
485e679736 fnserver: 0.3.225 release [skip ci] 2017-12-06 21:52:24 +00:00
CI
04d1183bd9 fnserver: 0.3.224 release [skip ci] 2017-12-06 19:02:37 +00:00
CI
937e78d6a3 fnserver: 0.3.223 release [skip ci] 2017-12-06 18:57:17 +00:00
CI
af8b8d87a0 fnserver: 0.3.222 release [skip ci] 2017-12-06 18:28:57 +00:00
Travis Reeder
6b8627d1c5 Fixes to recent extension changes. (#568)
* Fixes to recent extension changes.

* Fixes issue where gin will continue calling the handler even if next() isn't called.

* Updated docs.
2017-12-06 10:12:55 -08:00
CI
bcc91dcdfa fnserver: 0.3.221 release [skip ci] 2017-12-06 17:31:50 +00:00
jan grant
e05afebed1 Nitfix for #548 (#571) 2017-12-06 09:15:16 -08:00
CI
96dda67bd9 fnserver: 0.3.220 release [skip ci] 2017-12-06 11:34:16 +00:00
CI
1e3edd45f0 fnserver: 0.3.219 release [skip ci] 2017-12-06 00:22:20 +00:00
CI
f89367f526 fnserver: 0.3.218 release [skip ci] 2017-12-05 18:42:23 +00:00
Nigel Deakin
96f27070be More metrics (#561)
* Add new spans to agent.submit
2017-12-05 10:26:28 -08:00
CI
5dc6f164de fnserver: 0.3.217 release [skip ci] 2017-12-05 18:14:04 +00:00
CI
c501d3232f fnserver: 0.3.216 release [skip ci] 2017-12-05 16:39:14 +00:00
Travis Reeder
0798f9fac8 Middleware upgrade (#554)
* Adds root level middleware

* Added todo

* Better way for extensions to be added.

* Bad conflict merge?
2017-12-05 08:22:03 -08:00
CI
8dd1244ef3 fnserver: 0.3.215 release [skip ci] 2017-12-02 01:13:30 +00:00
Tolga Ceylan
dd88ec5d4e fn: sigterm graceful shutdown handling (#557) 2017-12-01 16:56:17 -08:00
CI
d2556ae2c7 fnserver: 0.3.214 release [skip ci] 2017-12-01 19:37:18 +00:00
CI
02f39749e3 fnserver: 0.3.213 release [skip ci] 2017-12-01 19:30:25 +00:00
Tolga Ceylan
25f6706642 Container memory tracking related changes (#541)
* fn: get available memory related changes

*) getAvailableMemory() improvements
*) early fail if requested memory too large to meet
*) tracking async and sync pools individually. Sync pool
is reserved for sync jobs only, while async pool can be
used by all jobs.
*) head room estimation for available memory in Linux.
2017-12-01 11:21:16 -08:00
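Note on 25f6706642: a minimal sketch of the pool rules listed in that message (sync pool reserved for sync jobs, async pool usable by all jobs, early failure when a request can never fit). All names here are hypothetical, and the choice to try the sync pool first for sync jobs is an assumption; the message does not state an order.

```go
package main

import (
	"errors"
	"sync"
)

var (
	// ErrTooLarge: the request exceeds total pool capacity and can never be met.
	ErrTooLarge = errors.New("requested memory exceeds pool capacity")
	// ErrNoCapacity: the request could fit, but not right now.
	ErrNoCapacity = errors.New("no memory available at the moment")
)

// Pool identifies which pool a reservation was taken from.
type Pool int

const (
	PoolNone Pool = iota
	PoolSync
	PoolAsync
)

// MemPools tracks free memory in the sync and async pools separately.
type MemPools struct {
	mu                    sync.Mutex
	syncTotal, syncFree   uint64
	asyncTotal, asyncFree uint64
}

func NewMemPools(syncMem, asyncMem uint64) *MemPools {
	return &MemPools{syncTotal: syncMem, syncFree: syncMem,
		asyncTotal: asyncMem, asyncFree: asyncMem}
}

// Acquire reserves mem units for a job. Async jobs may use only the async
// pool; sync jobs try the sync pool first (assumed order), then the async pool.
func (p *MemPools) Acquire(mem uint64, async bool) (Pool, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	// Early fail: no pool this job may use is large enough, even when empty.
	if mem > p.asyncTotal && (async || mem > p.syncTotal) {
		return PoolNone, ErrTooLarge
	}
	if !async && mem <= p.syncFree {
		p.syncFree -= mem
		return PoolSync, nil
	}
	if mem <= p.asyncFree {
		p.asyncFree -= mem
		return PoolAsync, nil
	}
	return PoolNone, ErrNoCapacity
}

// Release returns a reservation to the pool it was taken from.
func (p *MemPools) Release(pool Pool, mem uint64) {
	p.mu.Lock()
	defer p.mu.Unlock()
	switch pool {
	case PoolSync:
		p.syncFree += mem
	case PoolAsync:
		p.asyncFree += mem
	}
}
```

Distinguishing `ErrTooLarge` from `ErrNoCapacity` lets a caller fail immediately in the first case and wait or retry in the second.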
CI
ae8c0d27be fnserver: 0.3.212 release [skip ci] 2017-12-01 18:10:52 +00:00
CI
3f63e613eb fnserver: 0.3.211 release [skip ci] 2017-12-01 17:51:52 +00:00
CI
06313fb9b0 fnserver: 0.3.210 release [skip ci] 2017-12-01 16:11:42 +00:00
CI
dfead4b0c2 fnserver: 0.3.209 release [skip ci] 2017-11-30 02:06:39 +00:00
Denis Makogon
5c68a88599 Fn-prefix everything (#545)
* Fn-prefix everything

Closes: #492

* Global replacement

* missed one fn_
2017-11-29 17:50:24 -08:00
CI
9b531bb675 fnserver: 0.3.208 release [skip ci] 2017-11-29 12:20:39 +00:00
Nigel Deakin
9a75785cbf Per route api extensions (#542)
* Extend extension mechanism to support per-route API extensions

* Tidy up comment

* Remove print statement

* Minor improvement to README

* Avoid calling c.Request.Context() twice
2017-11-29 12:03:23 +00:00
CI
209234960b fnserver: 0.3.207 release [skip ci] 2017-11-29 00:02:40 +00:00
Travis Reeder
a67d5a6290 Drop viper dependency (#550)
* Removed viper dependency.

* removed from glide files
2017-11-28 15:46:17 -08:00
CI
1529dff810 fnserver: 0.3.206 release [skip ci] 2017-11-28 18:29:50 +00:00
CI
3a2abbff28 fnserver: 0.3.205 release [skip ci] 2017-11-28 17:38:09 +00:00
Reed Allman
892c843d87 add error to call model (#539)
* add error to call model

closes #331

previously, for async this error was being masked completely even if it was
something useful like the image not existing. for sync, the error was returned
in the http request but now it's also being stored. this error itself can
cover a lot of landscape, it could be an error in getting a slot, pulling an
image, running a container, among other things. anyway, no longer being
masked. we can likely improve it in certain cases we run into in the future,
but it's open ended at the moment and not being masked like some errors in
sync http request returns (503 non-models.APIError) for now.

* tucks in callTrigger stuff to keep api clean
* adds swagger
* adds migration
* adds tests for datastore and agent to ensure behavior

* pull images before tests are run

* gofmt migrations file
2017-11-28 11:21:39 -06:00
CI
adb9872921 fnserver: 0.3.204 release [skip ci] 2017-11-28 16:32:54 +00:00
Nigel Deakin
954f69e74a Add appname to basic metrics (#547)
* Add app labels to queued/running/completed/failed metrics
2017-11-28 10:17:24 -06:00
CI
93ab1f0bc2 fnserver: 0.3.203 release [skip ci] 2017-11-27 15:09:20 +00:00
Reed Allman
c9198b8525 add per call stats field as histogram (#528)
* add per call stats field as histogram

this will add a histogram of up to 240 data points of call data, produced
every second, stored at the end of a call invocation in the db. the same
metrics are also still shipped to prometheus (prometheus has the
not-potentially-reduced version). for the API reference, see the updates to
the swagger spec, this is just added onto the get call endpoint.

this does not add any extra db calls and the field for stats in call is a json
blob, which is easily modified to add / omit future fields. this is just
tacked on to the call we're making to InsertCall, and expect this to add very
little overhead; we are bounding the set to be relatively small, planning to
clean out the db of calls periodically, functions will generally be short, and
the same code used at a previous firm did not cause a notable db size increase
with production workload that is worse, wrt histogram size (I checked). the
code changes are really small aside from changing to strfmt.DateTime,
adding a migration and implementing sql.Valuer; needed to slightly modify the
swap function so that we can safely read `call.Stats` field to upload at end.

with the full histogram in hand, we can compute max/min/average/median/growth
rate/bernoulli distributions/whatever very easily in a UI or tooling. in
particular, this data is easily chartable [for a UI], which is beneficial.

* adds swagger spec of api update to calls endpoint
* adds migration for call.stats field
* adds call.stats field to sql queries
* change swapping of hot logger to exec, so we know that call.Stats is no
longer being modified after `exec` [in call.End]
* throws out docker stats between function invocations in hot functions (no
call to store them on, we could change this later for debug; they're in prom)
* tested in tests and API

closes #19

* add format of ints to swag
2017-11-27 08:52:53 -06:00
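Note on c9198b8525: the message says the stored histogram holds up to 240 data points and that prometheus keeps the "not-potentially-reduced" version, but it does not say how the reduction is done. One possible approach, shown purely as an assumption, is to average contiguous buckets of samples until at most 240 points remain. The function name `Downsample` and the use of plain `float64` samples are illustrative, not taken from the commit.

```go
package main

// MaxStatPoints mirrors the "up to 240 data points" bound from the message.
const MaxStatPoints = 240

// Downsample reduces samples to at most max points by averaging contiguous,
// near-equal-width buckets. Inputs already within the bound are returned
// unchanged; max <= 0 is treated as "no limit".
func Downsample(samples []float64, max int) []float64 {
	n := len(samples)
	if max <= 0 || n <= max {
		return samples
	}
	out := make([]float64, 0, max)
	for i := 0; i < max; i++ {
		// Bucket i covers samples[lo:hi]; since n > max, hi-lo >= 1.
		lo := i * n / max
		hi := (i + 1) * n / max
		var sum float64
		for _, v := range samples[lo:hi] {
			sum += v
		}
		out = append(out, sum/float64(hi-lo))
	}
	return out
}
```

With one sample per second, a call shorter than four minutes would be stored as is, and longer calls would be averaged down to 240 points before being written with the call record.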
CI
3adf478530 fnserver: 0.3.202 release [skip ci] 2017-11-24 16:48:55 +00:00
Denis Makogon
347edea56e Use valid call type instead in protocol (#534) 2017-11-24 10:32:17 -06:00
CI
19df15bd9b fnserver: 0.3.201 release [skip ci] 2017-11-22 20:17:21 +00:00
Tolga Ceylan
89dc79f0b0 fn: remove redundant httprouter code (#532)
*) tree from https://github.com/julienschmidt/httprouter
is already in Gin and this only seems to be parsing
parameters from URI.
2017-11-22 13:58:10 -06:00
CI
c2a0f83467 fnserver: 0.3.200 release [skip ci] 2017-11-22 15:43:11 +00:00
CI
7548ab44d5 fnserver: 0.3.199 release [skip ci] 2017-11-21 20:57:31 +00:00
Tolga Ceylan
2551be446a fn: introducing 503 responses for out of capacity case (#518)
* fn: introducing 503 responses for out of capacity case

*) Return 503 with a Retry-After header if the request fails
while waiting for slots.
*) TODO: return 503 without Retry-After if the request can
never be met by this fn server.
*) fn: runner test docker pull fixup
*) fn: MaxMemory for routes is now a variable to allow
testing and adjusting it according to fleet memory sizes.
2017-11-21 12:42:02 -08:00
CI
cd80ae0a3a fnserver: 0.3.198 release [skip ci] 2017-11-21 19:00:20 +00:00
CI
7764f99a63 fnserver: 0.3.197 release [skip ci] 2017-11-21 01:55:54 +00:00
Reed Allman
2d8c528b48 S3 loggyloo (#511)
* add minio-go dep, update deps

* add minio s3 client

minio has an s3 compatible api and is an open source project and, notably, is
not amazon, so it seems best to use their client (fwiw the aws-sdk-go is a
giant hair ball of things we don't need, too). it was pretty easy and seems
to work, so rolling with it. also, minio is a totally feasible option for fn
installs in prod / for demos / for local.

* adds 's3' package for s3 compatible log storage api, for use with storing
logs from calls and retrieving them.
* removes DELETE /v1/apps/:app/calls/:call/log endpoint
* removes internal log deletion api
* changes the GetLog API to use an io.Reader, which is a backwards step atm
due to the json api for logs, I have another branch lined up to make a plain
text log API and this will be much more efficient (also want to gzip)
* hooked up minio to the test suite and fixed up the test suite
* add docs on how to run minio and point fn at it

some notes: notably we aren't cleaning up these logs. there is a ticket
already to make a Mr. Clean who wakes up periodically and nukes old stuff, so
am punting any api design around some kind of TTL deletion of logs. there are
a lot of options really for Mr. Clean, we can notably defer to him when apps
are deleted, too, so that app deletion is fast and then Mr. Clean will just
clean them up later (seems like a good option).

have not tested against BMC object store, which has an s3 compatible API. but
in theory it 'just works' (the reason for doing this). in any event, that's
part of the service land to figure out.

closes #481
closes #473

* add log not found error to minio land
2017-11-20 17:39:45 -08:00