fn-serverless

mirror of https://github.com/fnproject/fn.git synced 2022-10-28 21:29:17 +03:00

Author	SHA1	Message	Date
Tolga Ceylan	5dc5740a54	fn: runner status and docker load images (#1116) * fn: runner status and docker load images Introducing a function run for pure runner Status calls. Previously, Status gRPC calls returned active inflight request counts with the purpose of a simple health checker. However this is not sufficient since it does not show if agent or docker is healthy. With this change, if pure runner is configured with a status image, that image is executed through docker. The call uses zero memory/cpu/tmpsize settings to ensure resource tracker does not block it. However, operators might not always have a docker repository accessible/available for status image. Or operators might not want the status to go over the network. To allow such cases, and in general possibly caching docker images, added a new environment variable FN_DOCKER_LOAD_FILE. If this is set, fn-agent during startup will load these images that were previously saved with 'docker save' into docker.	2018-07-12 13:58:38 -07:00
Tom Coupland	c7a50efd2c	Plan for func.yaml file changes for triggers. fnproject/cli#324 (#1115) The changes aim to be as minimal as possible: * Remove no longer available options * Add a schema version to the file for validation * Add trigger list block	2018-07-11 15:05:27 +01:00
Tolga Ceylan	300fcd7d92	fn: applications should be aware of reserved writable space (#1083) Similar to FN_MEMORY, we pass FN_TMPSIZE to function config.	2018-07-03 16:04:48 -07:00
Reed Allman	35e5f81bc8	add docs for --net=host hackery (#1105) I've found this to be extremely useful. Not that I expect anyone to be able to find this document on their own accord considering the breadth of documentation that we have, this can still be useful for linking to from slack at least (what docs are really for, right?) also the triggers doc stuck out as confusing considering all the triggers stuff going on, I was unable to comprehend how exactly it was helpful other than making people aware that openstack exists and they could build an extension into fn for it if they want to, but this seems true of most things? so, removed it, if anyone objects maybe we could improve it a little?	2018-07-02 11:40:00 -07:00
Owen Cliffe	128c9a5182	Fix verbs in docs (#1097) * Fix verbs [skip ci] * run ci	2018-07-02 07:09:42 -07:00
Tolga Ceylan	d0365bd2c9	fn: update route documentation for tmpfs_size (#1104)	2018-06-29 18:26:57 -07:00
Tolga Ceylan	3b98c19220	fn: swagger update for tmpfs size (#1034)	2018-06-29 16:03:58 -07:00
Tolga Ceylan	974a8d6f06	fn: add explanation of read-only disk and /tmp in faq/persistence. (#1103)	2018-06-29 12:39:26 -07:00
Rik Gibson	64fb6d27b4	Fixed up a couple of incorrect response codes (#1095) * Fixed up a couple of incorrect response codes * Standardise all entities on 204 with no return content on successful delete * Fix failing Fn.delete() test	2018-06-26 18:17:47 +01:00
Tom Coupland	88a674b24b	Removing Swaggerv2 error wrapping (#1092) The code does not produce errors in this shape.	2018-06-26 12:11:36 +01:00
Rik Gibson	4b67b0ed26	Tidying and rationalising summary and description strings, some minor… (#1090) * Tidying and rationalising summary and description strings, some minor rejigging of elements for consistency. * Removed a single solitary fullstop. * Quoted a few more unquoted strings... * Capitalized summaries, no full stops. Sentences for descriptions, yes full stops. * More Capitals, less full stops...	2018-06-26 00:33:13 +01:00
Tom Coupland	3ebff051a4	Add support for Function and Trigger domain objects (#1060) Vast commit, includes: * Introduces the Trigger domain entity. * Introduces the Fns domain entity. * V2 of the API for interacting with the new entities in swaggerv2.yml * Adds v2 end points for Apps to support PUT updates. * Rewrites the datastore level tests into a new pattern. * V2 routes use entity ID over name as the path parameter.	2018-06-25 15:37:06 +01:00
Sachin Pikle	abd8580300	Fixed broken links (#1087)	2018-06-25 18:26:19 +05:30
Owen Cliffe	456cbed8bd	Update CLI docs to reflect new CLI verb/noun structure (#1031) Use new CLI syntax	2018-06-08 11:47:04 +01:00
Tolga Ceylan	d190167580	fn: read-only root fs becomes default (#1019) * fn: read-only root fs becomes default Set root fs as read-only by default. * fn: update doc for FN_DISABLE_READONLY_ROOTFS	2018-05-30 18:17:28 -07:00
Reed Allman	cbe0d5e9ac	add user syslog writers to app (#970 ) * add user syslog writers to app users may specify a syslog url[s] on apps now and all functions under that app will spew their logs out to it. the docs have more information around details there, please review those (swagger and operating/logging.md), tried to implement to spec in some parts and improve others, open to feedback on format though, lots of liberty there. design decision wise, I am looking to the future and ignoring cold containers. the overhead of the connections there will not be worth it, so this feature only works for hot functions, since we're killing cold anyway (even if a user can just straight up exit a hot container). syslog connections will be opened against a container when it starts up, and then the call id that is logged gets swapped out for each call that goes through the container, this cuts down on the cost of opening/closing connections significantly. there are buffers to accumulate logs until we get a `\n` to actually write a syslog line, and a buffer to save some bytes when we're writing the syslog formatting as well. underneath writers re-use the line writer in certain scenarios (swapper). we could likely improve the ease of setting this up, but opening the syslog conns against a container seems worth it, and is a different path than the other func loggers that we create when we make a call object. the Close() stuff is a little tricky, not sure how to make it easier and have the ^ benefits, open to idears. this does add another vector of 'limits' to consider for more strict service operators. one being how many syslog urls can a user add to an app (infinite, atm) and the other being on the order of number of containers per host we could run out of connections in certain scenarios. 
there may be some utility in having multiple syslog sinks to send to, it could help with debugging at times to send to another destination or if a user is a client w/ someone and both want the function logs, e.g. (have used this for that in the past, specifically). this also doesn't work behind a proxy, which is something i'm open to fixing, but afaict will require a 3rd party dependency (we can pretty much steal what docker does). this is mostly of utility for those of us that work behind a proxy all the time, not really for end users. there are some unit tests. integration tests for this don't sound very fun to maintain. I did test against papertrail with each protocol and it works (and even times out if you're behind a proxy!). closes #337 * add trace to syslog dial	2018-05-15 11:00:26 -07:00
Tomas Knappek	19f09b3a6c	Added FN_API_CORS_HEADERS for configuring CORS headers (#997)	2018-05-15 18:03:01 +01:00
Tolga Ceylan	a69a930eb8	fn: doc update for FN_DOCKER_NETWORKS (#984)	2018-05-09 14:53:01 -07:00
Reed Allman	7607c003cf	improve private auth docs a little (#971)	2018-05-03 12:52:58 -07:00
Travis Reeder	977976fa52	Update cloudevents.md	2018-04-24 11:15:01 -07:00
Travis Reeder	3eb60e2028	CloudEvents I/O format support. (#948) * CloudEvents I/O format support. * Updated format doc. * Remove log lines * This adds support for CloudEvent ingestion at the http router layer. * Updated per comments. * Responds with full CloudEvent message. * Fixed up per comments * Fix tests * Checks for cloudevent content-type * doesn't error on missing content-type.	2018-04-23 16:05:13 -07:00
Tolga Ceylan	09bfa41da5	fn: file system 'size' option documentation (#889)	2018-03-26 21:30:39 -07:00
Denis Makogon	3c15ca6ea6	App ID (#641 ) * App ID * Clean-up * Use ID or name to reference apps * Can use app by name or ID * Get rid of AppName for routes API and model routes API is completely backwards-compatible routes API accepts both app ID and name * Get rid of AppName from calls API and model * Fixing tests * Get rid of AppName from logs API and model * Restrict API to work with app names only * Addressing review comments * Fix for hybrid mode * Fix rebase problems * Addressing review comments * Addressing review comments pt.2 * Fixing test issue * Addressing review comments pt.3 * Updated docstring * Adjust UpdateApp SQL implementation to work with app IDs instead of names * Fixing tests * fmt after rebase * Make tests green again! * Use GetAppByID wherever it is necessary - adding new v2 endpoints to keep hybrid api/runner mode working - extract CallBase from Call object to expose that to a user (it doesn't include any app reference, as we do for all other API objects) * Get rid of GetAppByName * Adjusting server router setup * Make hybrid work again * Fix datastore tests * Fixing tests * Do not ignore app_id * Resolve issues after rebase * Updating test to make it work as it was * Tabula rasa for migrations * Adding calls API test - we need to ensure we give "App not found" for the missing app and missing call in first place - making previous test work (request missing call for the existing app) * Make datastore tests work fine with correctly applied migrations * Make CallFunction middleware work again had to adjust its implementation to set app ID before proceeding * The biggest rebase ever made * Fix 8's migration * Fix tests * Fix hybrid client * Fix tests problem * Increment app ID migration version * Fixing TestAppUpdate * Fix rebase issues * Addressing review comments * Renew vendor * Updated swagger doc per recommendations	2018-03-26 11:19:36 -07:00
Owen Cliffe	d25b5af59d	Add annotations to routes and apps (#866) Adds 'annotations' attribute to Routes and Apps	2018-03-20 18:02:49 +00:00
Shaun Smith	795f37f1bd	Fix to broken CLI link. (#868) * Fix to broken CLI link. * Point to install CLI link in the Fn readme file	2018-03-19 17:42:18 +05:30
Gerardo Viedma	73ae77614c	Moves out node pool manager behind an extension using runner pool abstraction (Part 2) (#862 ) * Move out node-pool manager and replace it with RunnerPool extension * adds extension points for runner pools in load-balanced mode * adds error to return values in RunnerPool and Runner interfaces * Implements runner pool contract with context-aware shutdown * fixes issue with range * fixes tests to use runner abstraction * adds empty test file as a workaround for build requiring go source files in top-level package * removes flappy timeout test * update docs to reflect runner pool setup * refactors system tests to use runner abstraction * removes poolmanager * moves runner interfaces from models to api/runnerpool package * Adds a second runner to pool docs example * explicitly check for request spillover to second runner in test * moves runner pool package name for system tests * renames runner pool pointer variable for consistency * pass model json to runner * automatically cast to http.ResponseWriter in load-balanced call case * allow overriding of server RunnerPool via a programmatic ServerOption * fixes return type of ResponseWriter in test * move Placer interface to runnerpool package * moves hash-based placer out of open source project * removes siphash from Gopkg.lock	2018-03-16 13:46:21 +00:00
Reed Allman	9eaf824398	add jaeger support, link hot container & req span (#840 ) * add jaeger support, link hot container & req span * adds jaeger support now with FN_JAEGER_URL, there's a simple tutorial in the operating/metrics.md file now and it's pretty easy to get up and running. * links a hot request span to a hot container span. when we change this to sample at a lower ratio we'll need to finagle the hot container span to always sample or something, otherwise we'll hide that info. at least, since we're sampling at 100% for now if this is flipped on, can see freeze/unfreeze etc. if they hit. this is useful for debugging. note that zipkin's exporter does not follow the link at all, hence jaeger... and they're backed by the Cloud Empire now (CNCF) so we'll probably use it anyway. * vendor: add thrift for jaeger	2018-03-13 15:57:12 -07:00
Matt Stephenson	924fe2b72b	Fix missing instructions for building noop.so (#846)	2018-03-12 14:57:56 -07:00
Matt Stephenson	a787ccac36	Refactor controlplane into a go plugin (#833) * Refactor controlplane into a go plugin * Move vbox to controlplane package	2018-03-12 12:50:55 -07:00
Reed Allman	6967d0bfcb	json format __definition__ omit whitespace between objects (#835) http://json.org/ says: `Whitespace can be inserted between any pair of tokens. Excepting a few encoding details, that completely describes the language.` we do not explicitly need the whitespace between objects in our json, it's entirely optional and soon we will even support it (#830)!	2018-03-09 12:06:31 -08:00
Gerardo Viedma	8af57da7b2	Support load-balanced runner groups for multitenant compute isolation (#814 ) * Initial stab at the protocol * initial protocol sketch for node pool manager * Added http header frame as a message * Force the use of WithAgent variants when creating a server * adds grpc models for node pool manager plus go deps * Naming things is really hard * Merge (and optionally purge) details received by the NPM * WIP: starting to add the runner-side functionality of the new data plane * WIP: Basic startup of grpc server for pure runner. Needs proper certs. * Go fmt * Initial agent for LB nodes. * Agent implementation for LB nodes. * Pass keys and certs to LB node agent. * Remove accidentally left reference to env var. * Add env variables for certificate files * stub out the capacity and group membership server channels * implement server-side runner manager service * removes unused variable * fixes build error * splits up GetCall and GetLBGroupId * Change LB node agent to use TLS connection. * Encode call model as JSON to send to runner node. * Use hybrid client in LB node agent. This should provide access to get app and route information for the call from an API node. * More error handling on the pure runner side * Tentative fix for GetCall problem: set deadlines correctly when reserving slot * Connect loop for LB agent to runner nodes. * Extract runner connection function in LB agent. * drops committed capacity counts * Bugfix - end state tracker only in submit * Do logs properly * adds first pass of tracking capacity metrics in agent * maked memory capacity metric uint64 * maked memory capacity metric uint64 * removes use of old capacity field * adds remove capacity call * merges overwritten reconnect logic * First pass of a NPM Provide a service that talks to a (simulated) CP. 
- Receive incoming capacity assertions from LBs for LBGs - expire LB requests after a short period - ask the CP to add runners to a LBG - note runner set changes and readvertise - scale down by marking runners as "draining" - shut off draining runners after some cool-down period * add capacity update on schedule * Send periodic capcacity metrics Sending capcacity metrics to node pool manager * splits grpc and api interfaces for capacity manager * failure to advertise capacity shouldn't panic * Add some instructions for starting DP/CP parts. * Create the poolmanager server with TLS * Use logrus * Get npm compiling with cert fixups. * Fix: pure runner should not start async processing * brings runner, nulb and npm together * Add field to acknowledgment to record slot allocation latency; fix a bug too * iterating on pool manager locking issue * raises timeout of placement retry loop * Fix up NPM Improve logging Ensure that channels etc. are actually initialised in the structure creation! * Update the docs - runners GRPC port is 9120 * Bugfix: return runner pool accurately. * Double locking * Note purges as LBs stop talking to us * Get the purging of old LBs working. * Tweak: on restart, load runner set before making scaling decisions. * more agent synchronization improvements * Deal with teh CP pulling out active hosts from under us. * lock at lbgroup level * Send request and receive response from runner. * Add capacity check right before slot reservation * Pass the full Call into the receive loop. 
* Wait for the data from the runner before finishing * force runner list refresh every time * Don't init db and mq for pure runners * adds shutdown of npm * fixes broken log line * Extract an interface for the Predictor used by the NPM * purge drained connections from npm * Refactor of the LB agent into the agent package * removes capacitytest wip * Fix undefined err issue * updating README for poolmanager set up * ues retrying dial for lb to npm connections * Rename lb_calls to lb_agent now that all functionality is there * Use the right deadline and errors in LBAgent * Make stream error flag per-call rather than global otherwise the whole runner is damaged by one call dropping * abstracting gRPCNodePool * Make stream error flag per-call rather than global otherwise the whole runner is damaged by one call dropping * Add some init checks for LB and pure runner nodes * adding some useful debug * Fix default db and mq for lb node * removes unreachable code, fixes typo * Use datastore as logstore in API nodes. This fixes a bug caused by trying to insert logs into a nil logstore. It was nil because it wasn't being set for API nodes. * creates placement abstraction and moves capacity APIs to NodePool * removed TODO, added logging * Dial reconnections for LB <-> runners LB grpc connections to runners are established using a backoff stategy in event of reconnections, this allows to let the LB up even in case one of the runners go away and reconnect to it as soon as it is back. * Add a status call to the Runner protocol Stub at the moment. To be used for things like draindown, health checks. * Remove comment. * makes assign/release capacity lockless * Fix hanging issue in lb agent when connections drop * Add the CH hash from fnlb Select this with FN_PLACER=ch when launching the LB. 
* small improvement for locking on reloadLBGmembership * Stabilise the list of Runenrs returned by NodePool The NodePoolManager makes some attempt to keep the list of runner nodes advertised as stable as possible. Let's preserve this effort in the client side. The main point of this is to attempt to keep the same runner at the same inxed in the []Runner returned by NodePool.Runners(lbgid); the ch algorithm likes it when this is the case. * Factor out a generator function for the Runners so that mocks can be injected * temporarily allow lbgroup to be specified in HTTP header, while we sort out changes to the model * fixes bug with nil runners * Initial work for mocking things in tests * fix for anonymouse go routine error * fixing lb_test to compile * Refactor: internal objects for gRPCNodePool are now injectable, with defaults for the real world case * Make GRPC port configurable, fix weird handling of web port too * unit test reload Members * check on runner creation failure * adding nullRunner in case of failure during runner creation * Refactored capacity advertisements/aggregations. Made grpc advertisement post asynchronous and non-blocking. * make capacityEntry private * Change the runner gRPC bind address. This uses the existing `whoAmI` function, so that the gRPC server works when the runner is running on a different host. * Add support for multiple fixed runners to pool mgr * Added harness for dataplane system tests, minor refactors * Add Dockerfiles for components, along with docs. * Doc fix: second runner needs a different name. 
* Let us have three runners in system tests, why not * The first system test running a function in API/LB/PureRunner mode * Add unit test for Advertiser logic * Fix issue with Pure Runner not sending the last data frame * use config in models.Call as a temporary mechanism to override lb group ID * make gofmt happy * Updates documentation for how to configure lb groups for an app/route * small refactor unit test * Factor NodePool into its own package * Lots of fixes to Pure Runner - concurrency woes with errors and cancellations * New dataplane with static runnerpool (#813) Added static node pool as default implementation * moved nullRunner to grpc package * remove duplication in README * fix go vet issues * Fix server initialisation in api tests * Tiny logging changes in pool manager. Using `WithError` instead of `Errorf` when appropriate. * Change some log levels in the pure runner * fixing readme * moves multitenant compute documentation * adds introduction to multitenant readme * Proper triggering of system tests in makefile * Fix insructions about starting up the components * Change db file for system tests to avoid contention in parallel tests * fixes revisions from merge * Fix merge issue with handling of reserved slot * renaming nulb to lb in the doc and images folder * better TryExec sleep logic clean shutdown In this change we implement a better way to deal with the sleep inside the for loop during the attempt for placing a call. Plus we added a clean way to shutdown the connections with external component when we shut down the server. * System_test mysql port set mysql port for system test to a different value to the one set for the api tests to avoid conflicts as they can run in parallel. 
* change the container name for system-test * removes flaky test TestRouteRunnerExecution pending resolution by issue #796 * amend remove_containers to remove new added containers * Rework capacity reservation logic at a higher level for now * LB agent implements Submit rather than delegating. * Fix go vet linting errors * Changed a couple of error levels * Fix formatting * removes commented out test * adds snappy to vendor directory * updates Gopkg and vendor directories, removing snappy and adding siphash * wait for db containers to come up before starting the tests * make system tests start API node on 8085 to avoid port conflict with api_tests * avoid port conflicts with api_test.sh which are run in parallel * fixes postgres port conflict and issue with removal of old containers * Remove spurious println	2018-03-08 14:45:19 -08:00
Tolga Ceylan	89a1fc7c72	Response size clamp (#786) * Limit response http body or json response size to FN_MAX_RESPONSE_SIZE (default unlimited) * If limits are exceeded 502 is returned with 'body too large' in the error message	2018-03-01 17:14:50 -08:00
Denis Makogon	91d7874e34	FDK Node reference (#793)	2018-02-27 18:06:21 +02:00
Joshua Smith	33e686ef39	removed unused item in swagger (#780) (#626) the Version is unused in the swagger. Remove that item from the definitions.	2018-02-26 11:23:50 -07:00
Travis Reeder	575e1d3d0c	Removes "type" from json format. Was pointless. (#783)	2018-02-20 12:04:08 -08:00
Travis Reeder	5eb534f243	fix bug	2018-02-16 07:38:53 -08:00
★★ (งツ)ว ★★	aee4f7f406	`Premises` not `premise` (#767) [skip ci] The common mistake. `premise` means something else.	2018-02-15 08:02:38 +00:00
Tolga Ceylan	c848fc6181	fn: hot container timer improvements (#751) * fn: hot container timer improvements With this change, now we are allocating the timers when the container starts and managing them via stop/clear as needed, which should not only be more efficient, but also easier to follow. For example, previously, if eject time out was set to 10 secs, this could have delayed idle timeout up to 10 secs as well. It is also not necessary to do any math for elapsed time. Now consumers avoid any requeuing when startDequeuer() is cancelled. This was triggering additional dequeue/requeue causing containers to wake up spuriously. Also in startDequeuer(), we no longer remove the item from the actual queue and leave this to acquire/eject, which side steps issues related with item landing in the channel, not consumed, etc.	2018-02-12 14:12:03 -08:00
Tolga Ceylan	f27d47f2dd	Idle Hot Container Freeze/Preempt Support (#733) * fn: freeze/unfreeze and eject idle under resource contention	2018-02-07 17:21:53 -08:00
Dario Domizioli	e753732bd8	Hot protocols improvements (for 662) (#724) * Improve deadline handling in streaming protocols * Move special headers handling down to the protocols * Adding function format documentation for JSON changes * Add tests for request url and method in JSON protocol * Fix protocol missing fn-specific info * Fix import * Add panic for something that should never happen	2018-01-31 12:26:43 +00:00
Travis Reeder	7ace234848	Cleaned up main readme a bit (#693) * Cleaned up main readme a bit * Update README.md	2018-01-24 09:31:28 -08:00
Chad Arimura	43c9c5a5c5	remove some old non-fn assets that snuck in (#718)	2018-01-24 07:38:04 -08:00
Denis Makogon	d3be603e54	Fnlb was moved to its own repo: fnproject/lb (#702) * Fnlb was moved to its own repo: fnproject/lb * Clean up fnlb leftovers * Newer deps	2018-01-22 14:17:29 -08:00
Reed Allman	899cc027b5	fixes header format in function file docs (#711) closes #145	2018-01-22 12:02:14 -08:00
Reed Allman	333d07c58d	add config placement info to docs (#703) this behavior was recently cemented but was entirely omitted from the doc on 'how to write functions'	2018-01-18 15:00:32 -08:00
Travis Reeder	5a2602d42e	Updated docs, cleaned things up, DRY's up function format stuff. (#688) * Updated docs, cleaned things up, DRY's up function format stuff. * deleted files * updated bad link	2018-01-16 14:45:44 -08:00
Tolga Ceylan	aa17e3b7e0	fn: cpus documentation (#682)	2018-01-12 14:49:03 -08:00
Tolga Ceylan	39b2cb2d9b	Cpu resources (#642) * fn: cpu quota implementation	2018-01-12 11:38:28 -08:00
Nigel Deakin	d1e02f42ed	Update writing.md (#668) Trying to make this a bit clearer	2018-01-10 09:21:55 -08:00
Reed Allman	24aa911609	add FN_LOG_DEST for logs, fixup init (#663) * add FN_LOG_DEST for logs, fixup init * FN_LOG_DEST can point to a remote logging place (papertrail, whatever) * FN_LOG_PREFIX can add a prefix onto each log line sent to FN_LOG_DEST default remains stderr with no prefix. users need this to send to various logging backends, though it could be done operationally, this is somewhat simpler. we were doing some configuration stuff inside of init() for some of the global things. even though they're global, it's nice to keep them all in the normal server init path. we have had strange issues with the tracing setup, I tested the last repro of this repeatedly and didn't have any luck reproducing it, though maybe it comes back. * add docs	2018-01-09 14:27:50 -08:00

272 Commits