fn-serverless

mirror of https://github.com/fnproject/fn.git synced 2022-10-28 21:29:17 +03:00

Author	SHA1	Message	Date
Reed Allman	2b797a556a	update docs with pro tips for fdk http stream people (#1211 ) * update docs with pro tips for fdk http stream people * fix bug where container could die before uds wait we used to hang out for an hour. oopsie, thanks Owen	2018-09-14 16:54:18 +01:00
Reed Allman	3a9c48b8a3	http-stream format (#1202 ) * POC code for inotify UDS-io-socket * http-stream format introducing the `http-stream` format support in fn. there are many details for this, none of which can be linked from github :( -- docs are coming (I could even try to add some here?). this is kinda MVP-ish level, but does not implement the remaining spec, ie 'headers' fixing up / invoke fixing up. the thinking being we can land this to test fdks / cli with and start splitting work up on top of this. all other formats work the same as previous (no breakage, only new stuff) with the cli you can set `format: http-stream` and deploy, and then invoke a function via the `http-stream` format. this uses unix domain socket (uds) on the container instead of previous stdin/stdout, and fdks will have to support this in a new fashion (will see about getting docs on here). fdk-go works, which is here: https://github.com/fnproject/fdk-go/pull/30 . the output looks the same as an http format function when invoking a function. wahoo. there's some amount of stuff we can clean up here, enumerated: * the cleanup of the sock files is iffy, high pri here * permissions are a pain in the ass and i punted on dealing with them. you can run `sudo ./fnserver` if running locally, it may/may not work in dind(?) ootb * no pipe usage at all (yay), still could reduce buffer usage around the pipe behavior, we could clean this up potentially before removal (and tests) * my brain can’t figure out if dispatchOldFormats changes pipe behavior, but tests work * i marked XXX to do some clean up which will follow soon… need this to test fdk tho so meh, any thoughts on those marked would be appreciated however (1 less decision for me). mostly happy w/ general shape/plumbing tho * there are no tests atm, this is a tricky dance indeed. attempts were made. need to futz with the permission stuff before committing to adding any tests here, which I don't like either. 
also, need to get the fdk-go based test image updated according to the fdk-go, and there's a dance there too. rumba time.. * delaying the big big cleanup until we have good enough fdk support to kill all the other formats. open to ideas on how to maneuver landing stuff... * fix unmount * see if the tests work on ci... * add call id header * fix up makefile * add configurable iofs opts * add format file describing http-stream contract * rm some cruft * default iofs to /tmp, remove mounting out of the box fn we can't mount. /tmp will provide a memory backed fs for us on most systems, this will be fine for local developing and this can be configured to be wherever for anyone that wants to make things more difficult for themselves. also removes the mounting, this has to be done as root. we can't do this in the oss fn (short of requesting root, but no). in the future, we may want to have a knob here to have a function that can be configured in fn that allows further configuration here. since we don't know what we need in this dept really, not doing that yet (it may be the case that it could be done operationally outside of fn, eg, but not if each directory needs to be configured itself, which seems likely, anyway...) * add WIP note just in case...	2018-09-14 10:59:12 +01:00
Tolga Ceylan	4dcdb7d982	fn: paused and evicted container stats (#1209 ) * fn: paused and evicted container stats With this change, now stats reports paused state as well as incidents of container exit due to evictions. * fn: update/document state transitions in state tracker There's no case of a transition moving from done to waiting. This must be deprecated behavior.	2018-09-13 16:24:26 -07:00
Tolga Ceylan	586d5c4735	fn: make call.End() blocking to reduce complexity (#1208 ) agent/lb-agent/runner roles execute call.End() in the background in some cases to reduce latency. With this change, we simplify this and switch to non-background execution of call.End(). This fixes hard-to-detect issues such as non-deterministic calculation of call.CompletedAt or incomplete Call.Stats in runners. Downstream projects impacted by the now-blocking call.End() latency should take steps to handle this according to their requirements.	2018-09-13 11:28:11 +01:00
Tom Coupland	a0ccc4d7c4	Copy logs up to v2 endpoints (#1207 ) Copies the log endpoints up to the V2 endpoints, in a similar way to the call endpoints. The main change is to when logs are inserted into S3. The signature of the function has been changed to take the whole call object, rather than just the app and call IDs. This allows the function to switch between calls for Routes and those for Fns. Obviously this switching can be removed when v1 is removed. The SQL implementation inserts with both appID and fnID; this allows the two gets to work, as well as the downgrade of the migration. When the v1 logs are removed, the appID can be dropped. The log fetch test and error messages have been changed to be FnID specific.	2018-09-13 10:30:10 +01:00
Tolga Ceylan	aabbe0fba5	fn: check context timeout when waiting for non-blocking attach (#1201 ) * fn: check context timeout when waiting for non-blocking attach With this change, we no longer allow docker client AttachToContainerNonBlocking to block on Success channel more than our context deadline/timeout. * fn: move nbio chan handling in attach to docker from docker-client	2018-09-12 13:01:51 -07:00
Tolga Ceylan	6226af933a	fn: slot metrics/stats should be in stats/metrics removing logging (#1200 ) Slot stats are too noisy. These should be (or shortly will be) in metrics/stats/tracing.	2018-09-10 16:30:25 -07:00
Tolga Ceylan	bb8436c3ee	fn: docker driver stats/metrics for prometheus (#1197 ) * fn: docker driver stats/metrics for prometheus	2018-09-10 13:35:50 -07:00
Gerardo Viedma	0e01f3e547	Gracefully handles client request cancelations, instead of treating them as server errors (#1194 ) * Gracefully handles client request cancelations, instead of logging them as a 500 error * adds runner_addr to runner client logs	2018-09-05 07:53:48 +01:00
Reed Allman	7638b31e11	use tini to run every container (#1195 ) fixes #1101 additional context: * this was introduced in docker 1.13 (1/2017), we require docker 17.10 (10/2017), this should not have any issues dependency-wise, as `docker-init` is in the docker install from that point in time. unless explicitly removed, it should be in the dind container we use as well... * the PR that introduced this to docker is https://github.com/moby/moby/pull/26061 for additional context * it may be wise to put this through some paces, if anybody has any... interesting... function containers. the tests seem to work fine, however, and this shouldn't be something users have to think about (?) at all, just something that we are doing. this isn't the default in docker for compatibility reasons, which is maybe a yellow flag but I am not sure tbh	2018-09-04 15:41:30 -07:00
Tolga Ceylan	ad011fde7f	fn: introducing docker-syslog driver as default logger (#1189 ) * fn: introducing docker-syslog driver as default logger With this change, fn-agent prefers RFC5424 docker-syslog driver for logging stdout/stderr from containers. The advantage of this is to offload it to docker itself instead of streaming stderr along with stdout, which gets multiplexed through a single connection via docker-API. The change will need support from FDKs in order to log the correct call-id and suppress '\n' that splits syslog lines.	2018-08-29 13:08:02 -07:00
Gerardo Viedma	802832436c	Sets FN_PATH in models.Call for fn invoke requests (#1192 )	2018-08-29 12:58:39 +01:00
Reed Allman	292f673747	Go1.11 (#1188 ) * update circleci to go1.11 * update opencensus dep to build with go1.11 * fix up for new gofmt rules	2018-08-27 10:55:52 -07:00
Reed Allman	9cac4c8eea	update fsouza to v1.2.0 (#1186 ) * update fsouza to v1.2.0 * unwind timeouts on docker previously, we were setting our own transport on the docker client, but this does not work anymore as fsouza now needs to call this: https://github.com/fsouza/go-dockerclient/blob/master/client_unix.go which makes a platform dependent client. fsouza now also appears to make a transport that modifies the default http client with some saner values for things like max idle conns per host (they get reaped if idle 90s): https://github.com/fsouza/go-dockerclient/blob/master/client.go#L1059 -- these settings are sane and were why we were doing this to begin with. additionally, have removed our setting of timeout on the docker client for 2 minutes. this is a leftover relic of a bygone era from a time when we relied on these timeouts to timeout higher level things, which now we're properly timing out in the enclosing methods. so, they gone, this makes the docker client a little less whacky now.	2018-08-24 11:36:02 -07:00
Reed Allman	a6d60551ab	disable user function logs at debug level config (#1179 )	2018-08-21 21:02:49 -07:00
Tom Coupland	79a7308a17	Adding Fn invoke endpoint that works just like triggers endpoint (#1168 )	2018-08-13 10:01:52 +01:00
Peter Jausovec	35408ac949	Change the syslog format to use app_name instead of app_id (#1166 ) * Add AppName to the models.Call, so we can include it in the syslog * Replace the app_id with app_name	2018-08-09 12:06:19 -07:00
Tolga Ceylan	f57571fb3a	fn: SSL config adjustments (#1160 ) SSL related FN_NODE_CERT (and related) settings are not very clear today. Removing this in favor of a simple map of tls.Config objects. Three keys are provided for this map: TLSGRPCServer TLSAdminServer TLSWebServer which correspond to server TLS settings for the associated services. Operators/implementers can further add more keys to the map and add their own TLS config.	2018-08-06 20:57:03 -07:00
Tolga Ceylan	b6aeae3680	fn: moving opencensus distribution buckets out of agent (#1158 ) Users can best pick the proper range for their operating environment. Default cmd/fnserver uses some sensible defaults.	2018-08-06 10:48:52 -07:00
Tolga Ceylan	b524a94651	fn: fix math error in calculating msecs in container states (#1157 )	2018-08-03 17:25:01 -07:00
Owen Cliffe	c3a46f9452	Use sha256 for slot token (#1155 )	2018-08-03 19:07:28 +01:00
Tolga Ceylan	0105f8321e	fn: stats view/distribution improvements (#1154 ) * fn: stats view/distribution improvements ) View latency distribution is now an argument in view creation functions. This allows easier override to set custom buckets. It is simplistic and assumes all latency views would use the same set, but in practice this is already the case. ) Removed API view creation to main, this should not be enabled for all node types. This is consistent with the rest of the system. * fn: Docker samples of cpu/mem/disk with specific buckets	2018-08-03 11:06:54 -07:00
Reed Allman	af94f3f8ac	move max_request_size from agent to server (#1145 ) moves the config option for max request size up to the front end, adds the env var for it there, adds a server test for it and removes it from agent. a request is either gonna come through the lb (before grpc) or to the server, we can handle limiting the request there at least now, which may be easier than having multiple layers of request body checking. this aligns with not making the agent as responsible for http behaviors (eventually, not at all once route is fully deprecated).	2018-07-31 08:58:47 -07:00
Reed Allman	409c104df3	make agent options/config pass lint checks (#1144 )	2018-07-30 16:04:27 -07:00
Tolga Ceylan	9f29d824d6	fn: New timeout for LB Placer (#1137 ) * fn: New timeout for LB Placer Previously, LB Placers worked hard as long as client contexts allowed for. Adding a Placer config setting to bound this by 360 seconds by default. The new timeout is not accounted during actual function execution and only applies to the amount of wait time in Placers when the call is not being executed.	2018-07-26 10:19:25 -07:00
Tolga Ceylan	2706323cec	fn: tests for private repo auth and rename DOCKER_AUTH (#1134 ) Renamed DOCKER_AUTH with FN_ prefix to clarify the purpose. Docker does not use this variable. New tests to clarify the repo/auth-config behavior.	2018-07-24 15:19:59 -07:00
Tolga Ceylan	cf37a21fab	fn: cleanup of docker private registry code (#1130 ) * fn: cleanup of docker private registry code Start using URL parsed ServerAddress and its subdomains for easier image ensure/pull in docker driver. Previous code to lookup substrings was faulty without proper URL parse and hostname tokenization. When searching for a registry config, if image name does not contain a registry and if there's a private registry configured, then search for hub.docker.com and index.docker.io. This is similar to previous code but with correct subdomain matching. * fn-dataplane: take port into account in auth configs	2018-07-24 02:15:25 +01:00
Tolga Ceylan	fc71208063	fn: add context into logger passed to DialWithBackoff (#1133 )	2018-07-23 13:05:30 -07:00
Tolga Ceylan	db7cbf73e2	fn: add requests received/handled in Status responses (#1132 ) This is useful as additional data to inflight requests. Callers can determine request arrival and processing rate.	2018-07-20 16:00:02 -07:00
Tolga Ceylan	1258baeb7f	fn: agent eviction revisited (#1131 ) * fn: agent eviction revisited Previously, the hot-container eviction logic used the number of waiters of cpu/mem resources to decide to evict a container. An ejection ticker used to wake up its associated container every 1 sec to reassess system load based on waiter count. However, this does not work for non-blocking agent since there are no waiters for non-blocking mode. Background on blocking versus non-blocking agent: ) Blocking agent holds a request until the request is serviced or client times out. It assumes the request can be eventually serviced when idle containers eject themselves or busy containers finish their work. ) Non-blocking mode tries to limit this wait time. However non-blocking agent has never been truly non-blocking. This simply means that we only make a request wait if we take some action in the system. Non-blocking agents are configured with a much higher hot poll frequency to make the system more responsive as well as to handle cases where a too-busy event is missed by the request. This is because the communication between hot-launcher and waiting requests is not 1-1 and is lossy if another request arrives for the same slot queue and receives a too-busy response before the original request. Introducing an evictor where each hot container can register itself, if it is idle for more than 1 second. Upon registration, these idle containers become eligible for eviction. In hot container launcher, in non-blocking mode, before we attempt to emit a too-busy response, now we attempt an eviction. If this is successful, then we wait some more. This could result in requests waiting for more than they used to only if a container was evicted. For blocking-mode, the hot launcher uses hot-poll period to assess if a request has waited for too long, then eviction is triggered.	2018-07-19 15:04:15 -07:00
Tolga Ceylan	e9d5221e15	fn: Status gRPC call timeout handling (#1125 ) Status calls should not directly use client gRPC context deadlines/timeouts during Status execution. Status should allow plenty of time for the scheduler agent and docker to run and emit useful error information. Setting this timeout to 60 seconds, which should surface disk I/O, docker, etc. issues.	2018-07-16 18:33:23 -07:00
Tolga Ceylan	564db4e9d2	fn: Status should expose if data was served from cache. (#1123 ) This is useful in scenarios where gRPC client might want to reliably observe/report the status latency metrics and remove any possible duplicates. If the status query was served from cache, then these latencies show last execution latency.	2018-07-13 17:35:00 -07:00
Tolga Ceylan	5dc5740a54	fn: runner status and docker load images (#1116 ) * fn: runner status and docker load images Introducing a function run for pure runner Status calls. Previously, Status gRPC calls returned active inflight request counts with the purpose of a simple health checker. However this is not sufficient since it does not show if agent or docker is healthy. With this change, if pure runner is configured with a status image, that image is executed through docker. The call uses zero memory/cpu/tmpsize settings to ensure resource tracker does not block it. However, operators might not always have a docker repository accessible/available for status image. Or operators might not want the status to go over the network. To allow such cases, and in general possibly caching docker images, added a new environment variable FN_DOCKER_LOAD_FILE. If this is set, fn-agent during startup will load these images that were previously saved with 'docker save' into docker.	2018-07-12 13:58:38 -07:00
Owen Cliffe	fff95e7992	Clean up/make consistent the APIs for registering core components, make Docker an optional component at compile time (#1111 )	2018-07-07 10:37:19 +01:00
Owen Cliffe	b8b544ed25	HTTP Triggers hookup (#1086 ) * Initial support for invoking triggers * dupe method * tighten server constraints * runner tests not working yet * basic route tests passing * post rebase fixes * add hybrid support for trigger invoke and tests * consolidate all hybrid evil into one place * cleanup and make triggers unique by source * fix oops with Agent * linting * review fixes	2018-07-05 12:56:07 -05:00
Tolga Ceylan	300fcd7d92	fn: applications should be aware of reserved writable space (#1083 ) Similar to FN_MEMORY, we pass FN_TMPSIZE to function config.	2018-07-03 16:04:48 -07:00
Tolga Ceylan	317de18e6b	fn: lb-agent: Add Runner Scheduler/Execution Stats (#1107 ) LB agent reports lb placer latency. It should also report how long it took for the runner to initiate the call as well as execution time inside the container if the runner has accepted (committed) to the call.	2018-07-02 17:15:43 -07:00
Tom Coupland	3ebff051a4	Add support for Function and Trigger domain objects (#1060 ) Vast commit, includes: * Introduces the Trigger domain entity. * Introduces the Fns domain entity. * V2 of the API for interacting with the new entities in swaggerv2.yml * Adds v2 end points for Apps to support PUT updates. * Rewrites the datastore level tests into a new pattern. * V2 routes use entity ID over name as the path parameter.	2018-06-25 15:37:06 +01:00
Reed Allman	51ff7caeb2	Bye bye openapi (#1081 ) * add DateTime sans mgo * change all uses of strfmt.DateTime to common.DateTime, remove test strfmt usage * remove api tests, system-test dep on api test multiple reasons to remove the api tests: * awkward dependency with fn_go meant generating bindings on a branched fn to vendor those to test new stuff. this is at a minimum not at all intuitive, worth it, nor a fun way to spend the finite amount of time we have to live. * api tests only tested a subset of functionality that the server/ api tests already test, and we risk having tests where one tests some thing and the other doesn't. let's not. we have too many test suites as it is, and these pretty much only test that we updated the fn_go bindings, which is actually a hassle as noted above and the cli will pretty quickly figure out anyway. * fn_go relies on openapi, which relies on mgo, which is deprecated and we'd like to remove as a dependency. openapi is a _huge_ dep built in a NIH fashion, that cannot simply remove the mgo dep as users may be using it. we've now stolen their date time and otherwise killed usage of it in fn core, for fn_go it still exists but that's less of a problem. * update deps removals: * easyjson * mgo * go-openapi * mapstructure * fn_go * purell * go-validator also, had to lock docker. we shouldn't use docker on master anyway, they strongly advise against that. had no luck with latest version rev, so i locked it to what we were using before. until next time. the rest is just playing dep roulette, those end up removing a ton tho * fix exec test to work * account for john le cache	2018-06-21 11:09:16 -07:00
Tolga Ceylan	881a0ba1db	fn: agent call overrider (#1080 ) Similar to LB Agent call overrider, this PR adds Agent overrider for Agents to modify/analyze a Call/Extensions during GetCall().	2018-06-20 16:21:09 -07:00
Tolga Ceylan	e67d0e5f3f	fn: Call extensions/overriding and more customization friendly docker driver (#1065 ) In pure-runner and LB agent, service providers might want to set specific driver options. For example, to add cpu-shares to functions, LB can add the information as extensions to the Call and pass this via gRPC to runners. Runners then pick these extensions from gRPC call and pass it to driver. Using a custom driver implementation, pure-runners can process these extensions to modify docker.CreateContainerOptions. To achieve this, LB agents can now be configured using a call overrider. Pure-runners can be configured using a custom docker driver. RunnerCall and Call interfaces both expose call extensions. An example to demonstrate this is implemented in test/fn-system-tests/system_test.go which registers a call overrider for LB agent as well as a simple custom docker driver. In this example, LB agent adds a key-value to extensions and runners add this key-value as an environment variable to the container.	2018-06-18 14:42:28 -07:00
Andrea Rosa	e637661ea2	Adding a way to inject a request ID (#1046 ) * Adding a way to inject a request ID It is very useful to associate a request ID with each incoming request; this change allows providing a function to do that via a Server Option. The change comes with a default function which will generate a new request ID. The request ID is put in the request context along with a common logger which always logs the request ID. We add gRPC interceptors to the server so it can get the request ID out of the gRPC metadata and put it in the common logger stored in the context, so that all log lines using the common logger from the context will have the request ID logged.	2018-06-14 10:40:55 +01:00
Peter Jausovec	bd5150f1ac	Extract register view functionality (#1056 ) * WIP * Create separate Register*Views functions that are called from main.	2018-06-12 17:24:21 +01:00
Owen Cliffe	1ad27f4f0d	Inverting deps on SQL, Log and MQ plugins to make them optional dependencies of extended servers, Removing some dead code that brought in unused dependencies Filtering out some non-linux transitive deps. (#1057 ) * initial Db helper split - make SQL and datastore packages optional * abstracting log store * break out DB, MQ and log drivers as extensions * cleanup * fewer deps * fixing docker test * hmm dbness * updating db startup * Consolidate all your extensions into one convenient package * cleanup * clean up dep constraints	2018-06-11 18:23:28 +01:00
Tolga Ceylan	fce1e54746	fn: remove dead code in static pool (#1052 ) Static pool is oriented for testing/basic usage and, as its name implies, it is a static pool. Therefore, removing unnecessary/dead code.	2018-06-08 15:57:06 -07:00
Tolga Ceylan	8f969918bd	fn: removing unused/dead code (#1051 )	2018-06-08 15:51:19 -07:00
Tolga Ceylan	4fcb52f69d	fn: MaxTotalCPU and MaxTotalMemory in non-Linux systems (#1043 ) Non-Linux systems skip some of memory/cpu determination code in resource tracker. But config settings to cap these are used in tests, so they must not be ignored. With this change, we apply these config settings even on non-Linux systems. Memory allocation code is also now same in non-Linux systems, but default is raised to 2GB from 1.5GB.	2018-06-06 14:50:21 -07:00
Owen Cliffe	c6abc8bf64	Use context logging more to ensure context vars are present in log lines (#1039 )	2018-06-06 15:14:29 +01:00
Tolga Ceylan	4af53025d8	fn: lb-agent: Initial TryCall result can be retriable. (#1035 ) Before this change, we assumed data may end up in a container once we placed a TryCall() and if gRPC send failed, we did not retry. However, a send failure cannot result in data in a container, since only upon successful receipt of a TryCall can pure-runner schedule a call into a container. Here we trust gRPC and if gRPC layer says it could not send a msg, then the receiver did not receive it.	2018-06-05 14:41:13 -07:00
Andrea Rosa	c2c295ffb3	Add a LBAgent constructor which accepts AgentConfig (#1037 ) In some cases it could be useful to pass Agent configurations to the LBAgent constructor; this small change adds a new constructor which accepts an agent configuration as an additional parameter.	2018-06-05 13:59:43 -07:00

274 Commits