579 Commits

Author SHA1 Message Date
Tolga Ceylan
ef5c35c6f0 fn: add http.Server options for web/admin/grpc services in server (#1191)
* fn: add http header read timeout to Server

This allows time-limiting slow/malicious clients when
reading HTTP headers. In GetBody() buffering, the same timeout
can be used to give consistent I/O wait
limits for the service, in addition to the per-handler
limits we already have.

* fn: generic http Server settings for services
2018-09-12 11:41:06 -07:00
Tom Coupland
3c95d80dce Change not present v2 call endpoint response to Gone (#1206)
Currently, when the calls endpoints are disabled a 501 is
returned. While this is technically correct, it's not hard to see this
causing trouble when people tend to create 5xx roll up alerting
metrics.

This changes it to a 410, Gone, response, which is close enough and
should allow clients to know what's going on.
2018-09-12 16:27:34 +01:00
Tom Coupland
c3537399f1 The V2 Calls api endpoints have been added beneath fns: (#1203)
/fns/{fnID}/calls
/fns/{fnID}/calls/{callID}

The S3 implementation forces our hand: if we want to list Calls
under a Fn, we have to use the FnID as a prefix on the object names,
which means we need it to look up any Call. It also makes sense in
terms of resource hierarchy.

These endpoints can optionally be disabled (as other endpoints), if a
service provider needs to provide this functionality via other means.

The 'calls' test has been fully migrated to fn calls. This has been
done to reduce the copy pasta a bit, and on balance is ok as the
routes calls will be removed soon.
2018-09-12 15:45:53 +01:00
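The prefix argument in c3537399f1 can be illustrated with a small sketch. The key layout (`calls/<fnID>/<callID>`) and the helper names are assumptions, and the in-memory filter only stands in for an object store's prefix listing.

```go
package main

import (
	"fmt"
	"strings"
)

// callPrefix and callKey use a hypothetical key layout: calls/<fnID>/<callID>.
func callPrefix(fnID string) string      { return "calls/" + fnID + "/" }
func callKey(fnID, callID string) string { return callPrefix(fnID) + callID }

// listCalls mimics a prefix listing: only keys under the Fn's prefix match.
func listCalls(keys []string, fnID string) []string {
	var out []string
	prefix := callPrefix(fnID)
	for _, k := range keys {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{
		callKey("fn1", "a"),
		callKey("fn2", "b"),
		callKey("fn1", "c"),
	}
	fmt.Println(listCalls(keys, "fn1"))
}
```

With this layout, building the key for a single Call also needs the FnID, which is why the lookup route carries both identifiers.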
Tolga Ceylan
35d04cae6d fn: handle client connection close errors (#1196) 2018-09-05 16:56:31 -07:00
Gerardo Viedma
0e01f3e547 Gracefully handles client request cancelations, instead of treating them as server errors (#1194)
* Gracefully handles client request cancelations, instead of logging them as a 500 error

* adds runner_addr to runner client logs
2018-09-05 07:53:48 +01:00
Srinidhi Chokkadi Puranik
6e20cf8788 Pass right context in call to datastore.UpdateTrigger (#1185) 2018-08-23 21:44:15 -07:00
Tom Coupland
fc3f54d2da Insist trigger sources are prefixed (#1184)
* Insist trigger sources are prefixed

All trigger sources must have a '/' prefix to be allowed into the datastore.

* Adding condition to novelValue for gen tests

NovelValue was failing to detect same Config values correctly. This
adds a specific check for Config, like the one for Annotation, to
ensure a novel value is indeed generated.
2018-08-23 12:24:56 +01:00
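The '/' prefix rule from fc3f54d2da amounts to a one-line check. The function and error names in this sketch are hypothetical, not the validator used in fn.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errSourcePrefix is a hypothetical error value.
var errSourcePrefix = errors.New("trigger source must start with '/'")

// validateSource rejects any trigger source that lacks the '/' prefix.
func validateSource(source string) error {
	if !strings.HasPrefix(source, "/") {
		return errSourcePrefix
	}
	return nil
}

func main() {
	for _, s := range []string{"/hello", "hello", ""} {
		fmt.Printf("%q: %v\n", s, validateSource(s))
	}
}
```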
James Jeffrey
d336035678 Add annotation to trigger on create if endpoints are enabled (#1177)
* Add annotations for creation of triggers and fns along with the test for them (fixes #1178)

* Log errors and still return created resource for annotation failures
2018-08-21 10:26:36 +01:00
Tom Coupland
b1938c1cbf Fns now annotated with invoke urls, as per triggers (#1172)
Clone of the trigger work to inject invoke urls into the annotations
on a fn when it is returned from the server.

Small changes to triggers code following code review of the fn code.
2018-08-16 09:44:48 +01:00
Tom Coupland
79a7308a17 Adding Fn invoke endpoint that works just like triggers endpoint (#1168) 2018-08-13 10:01:52 +01:00
Tolga Ceylan
976b91a77d fn: API stats and tags reorganization (#1171)
Make sure we can apply extra tags if RegisterAPIViews() is
provided with such tags. Deduplicate path/method/status and
always apply these default tags to appropriate views.
2018-08-11 17:00:37 -07:00
Tolga Ceylan
8c271e8556 fn: add missing api response count in API metrics (#1170) 2018-08-10 12:14:04 -07:00
Tolga Ceylan
f57571fb3a fn: SSL config adjustments (#1160)
SSL related FN_NODE_CERT (and related) settings are
not very clear today. Removing these in favor of a
simple map of tls.Config objects. Three keys are
provided for this map:

TLSGRPCServer
TLSAdminServer
TLSWebServer

which correspond to server TLS settings for the
associated services.

Operators/implementers can further add more
keys to the map and add their own TLS config.
2018-08-06 20:57:03 -07:00
Tolga Ceylan
0105f8321e fn: stats view/distribution improvements (#1154)
* fn: stats view/distribution improvements

*) View latency distribution is now an argument
in view creation functions. This allows easier
override to set custom buckets. It is simplistic
and assumes all latency views would use the same
set, but in practice this is already the case.
*) Moved API view creation to main, this should not
be enabled for all node types. This is consistent with
the rest of the system.

* fn: Docker samples of cpu/mem/disk with specific buckets
2018-08-03 11:06:54 -07:00
Owen Cliffe
9b1f5e9cee Add server API to disable hybrid API on API servers (#1152) 2018-08-01 18:53:38 +01:00
Gerardo Viedma
23fc03c9f4 Expose ServeRoute method on Server to allow extensions to plugin custom route handling (#1151) 2018-08-01 09:57:12 +01:00
Reed Allman
af94f3f8ac move max_request_size from agent to server (#1145)
moves the config option for max request size up to the front end, adds the env
var for it there, adds a server test for it and removes it from agent. a
request is either gonna come through the lb (before grpc) or to the server, we
can handle limiting the request there at least now, which may be easier than
having multiple layers of request body checking. this aligns with not making
the agent as responsible for http behaviors (eventually, not at all once route
is fully deprecated).
2018-07-31 08:58:47 -07:00
james h
61d42e7621 Support _FILE postfixes for environment variables (#1142)
* Support _FILE postfixes for environment variables to be loaded from files

* fix gofmt error
2018-07-30 11:13:06 -07:00
Tolga Ceylan
9f29d824d6 fn: New timeout for LB Placer (#1137)
* fn: New timeout for LB Placer

Previously, LB Placers kept trying for as long as
client contexts allowed. Adding a Placer
config setting to bound this to 360 seconds by
default.

The new timeout does not count actual
function execution time and only applies to the amount
of wait time in Placers when the call is not
being executed.
2018-07-26 10:19:25 -07:00
Owen Cliffe
1d5892b0c6 Fix/test trigger annotations (#1126)
* Fix/test trigger annotations
2018-07-17 13:54:26 +01:00
Owen Cliffe
fff95e7992 Clean up/make consistent the APIs for registering core components, make Docker an optional component at compile time (#1111) 2018-07-07 10:37:19 +01:00
Owen Cliffe
b8b544ed25 HTTP Triggers hookup (#1086)
* Initial support for invoking triggers

* dupe method

* tighten server constraints

* runner tests not working yet

* basic route tests passing

* post rebase fixes

* add hybrid support for trigger invoke and tests

* consolidate all hybrid evil into one place

* cleanup and make triggers unique by source

* fix oops with Agent

* linting

* review fixes
2018-07-05 12:56:07 -05:00
Reed Allman
1cdb47d6e9 server, examples, extensions lint compliant (#1109)
these are all automated changes suggested by golint
2018-07-04 15:23:15 +01:00
Owen Cliffe
5d970d9295 Set shortcodes in trigger response entities based on config or request URL (#1099)
* adding trigger short code injection

* more annotation provider stuff

* fixed up tests

* Fix validator
2018-07-03 15:59:00 -05:00
Tolga Ceylan
317de18e6b fn: lb-agent: Add Runner Scheduler/Execution Stats (#1107)
LB agent reports lb placer latency. It should also report
how long it took for the runner to initiate the call as
well as execution time inside the container if the runner
has accepted (committed) to the call.
2018-07-02 17:15:43 -07:00
Tom Coupland
d7139358ce List Cursor management moved into datastore layer. (#1102)
* Don't try to delete an app that wasn't successfully created in the case of failure

* Allow datastore implementations to inject additional annotations on objects

* Allow for datastores transparently adding annotations on apps, fns and triggers. Change NameIn filter to Name for apps.

* Move *List types including JSON annotations for App, Fn and Trigger into models

* Change return types for GetApps, GetFns and GetTriggers on datastore to
be models.*List and move cursor generation into datastore

* Trigger cursor handling fixed into db layer

Also changes the name generation so that it is not in the same order
as the id (it is random), which means we are now testing our name ordering.

* GetFns now respects cursors

* Apps now feeds cursor back

* Mock fixes

* Fixing up api level cursor decoding

* Tidy up treatment of cursors in the db layer

* Adding conditions for non nil items lists

* fix mock test
2018-06-29 19:14:13 +01:00
Owen Cliffe
73d45db443 Fix JSON list responses (#1098) 2018-06-28 01:28:07 +01:00
Rik Gibson
64fb6d27b4 Fixed up a couple of incorrect response codes (#1095)
* Fixed up a couple of incorrect response codes

* Standardise all entities on 204 with no return content on successful delete

* Fix failing Fn.delete() test
2018-06-26 18:17:47 +01:00
Tom Coupland
3ebff051a4 Add support for Function and Trigger domain objects (#1060)
Vast commit, includes:

 * Introduces the Trigger domain entity.
 * Introduces the Fns domain entity.
 * V2 of the API for interacting with the new entities in swaggerv2.yml
 * Adds v2 end points for Apps to support PUT updates.
 * Rewrites the datastore level tests into a new pattern.
 * V2 routes use entity ID over name as the path parameter.
2018-06-25 15:37:06 +01:00
Reed Allman
51ff7caeb2 Bye bye openapi (#1081)
* add DateTime sans mgo

* change all uses of strfmt.DateTime to common.DateTime, remove test strfmt usage

* remove api tests, system-test dep on api test

multiple reasons to remove the api tests:

* awkward dependency with fn_go meant generating bindings on a branched fn to
vendor those to test new stuff. this is at a minimum not at all intuitive,
worth it, nor a fun way to spend the finite amount of time we have to live.
* api tests only tested a subset of functionality that the server/ api tests
already test, and we risk having tests where one tests some thing and the
other doesn't. let's not. we have too many test suites as it is, and these
pretty much only test that we updated the fn_go bindings, which is actually a
hassle as noted above and the cli will pretty quickly figure out anyway.
* fn_go relies on openapi, which relies on mgo, which is deprecated and we'd
like to remove as a dependency. openapi is a _huge_ dep built in a NIH
fashion, that cannot simply remove the mgo dep as users may be using it.
we've now stolen their date time and otherwise killed usage of it in fn core,
for fn_go it still exists but that's less of a problem.

* update deps

removals:

* easyjson
* mgo
* go-openapi
* mapstructure
* fn_go
* purell
* go-validator

also, had to lock docker. we shouldn't use docker on master anyway, they
strongly advise against that. had no luck with latest version rev, so i locked
it to what we were using before. until next time.

the rest is just playing dep roulette, those end up removing a ton tho

* fix exec test to work

* account for john le cache
2018-06-21 11:09:16 -07:00
Andrea Rosa
e637661ea2 Adding a way to inject a request ID (#1046)
* Adding a way to inject a request ID

It is very useful to associate a request ID with each incoming request;
this change allows providing a function to do that via a Server Option.
The change comes with a default function which will generate a new
request ID. The request ID is put in the request context along with a
common logger which always logs the request-id.

We add gRPC interceptors to the server so it can get the request ID out
of the gRPC metadata and put it in the common logger stored in the
context, so that all the log lines using the common logger from the context
will have the request ID logged.
2018-06-14 10:40:55 +01:00
Reed Allman
e848b4c88e should turn on views regardless of exporter (#1059)
https://github.com/fnproject/fn/pull/1058/files#r194913723
2018-06-12 22:45:50 -07:00
Tolga Ceylan
f24172aa9d fn: introducing lb placer basic metrics (#1058)
* fn: introducing lb placer basic metrics

This change adds basic metrics to naive and consistent
hash LB placers. The stats show how many times we scanned
the full runner list, if runner pool failed to return a
runner list or if runner pool returned an empty list.

Placed and not placed status are also tracked along with
if TryExec returned an error or not. Most common error
code, Too-Busy is specifically tracked.

If client cancels/times out, this is also tracked as
a client cancel metric.

For placer latency, we would like to know how much time
the placer spent on searching for a runner until it
successfully places a call. This includes round-trip
times for NACK responses from the runners until a successful
TryExec() call. By excluding last successful TryExec() latency,
we try to exclude function execution & runner container
startup time from this metric in an attempt to isolate
Placer only latency.

* fn: latency and attempt tracker

Removing full scan metric. Tracking number of
runners attempted is a better metric for this
purpose.

Also, if rp.Runners() fails, this is an unrecoverable
error and we should bail out instead of retrying.

* fn: typo fix, ch placer finalize err return

* fn: enable LB placer metrics in WithAgentFromEnv if prometheus is enabled
2018-06-12 13:36:05 -07:00
Owen Cliffe
1ad27f4f0d Inverting deps on SQL, Log and MQ plugins to make them optional dependencies of extended servers; removing some dead code that brought in unused dependencies; filtering out some non-linux transitive deps. (#1057)
* initial Db helper split - make SQL and datastore packages optional

* abstracting log store

* break out DB, MQ and log drivers as extensions

* cleanup

* fewer deps

* fixing docker test

* hmm dbness

* updating db startup

* Consolidate all your extensions into one convenient package

* cleanup

* clean up dep constraints
2018-06-11 18:23:28 +01:00
Reed Allman
00c29b8bf3 datastore no longer implements logstore (#1013)
* datastore no longer implements logstore

the underlying implementation of our sql store implements both the datastore
and the logstore interface, however going forward we are likely to encounter
datastore implementers that would mock out the logstore interface and not use
its methods - signalling a poor interface. this remedies that, now they are 2
completely separate things, which our sqlstore happens to implement both of.

related to some recent changes around wrapping, this keeps the imposed metrics
and validation wrapping of a servers logstore and datastore, just moving it
into New instead of in the opts - this is so that a user can have the
underlying datastore in order to set the logstore to it, since wrapping it in
a validator/metrics would render it no longer a logstore implementer (i.e.
validate datastore doesn't implement the logstore interface), we need to do
this after setting the logstore to the datastore if one wasn't provided
explicitly.

* splits logstore and datastore metrics & validation logic
* `make test` should be `make full-test` always. got rid of that so that
nobody else has to wait for CI to blow up on them after the tests pass locally
ever again.

* fix new tests
2018-06-04 00:08:16 -07:00
Tomas Knappek
e0425abd19 unmatched api handler reported as 'invalid' (#1028) 2018-06-01 13:49:40 -07:00
Tomas Knappek
e3e264de53 Api metrics (#1014)
* api metrics support

* comments reflected

* metrics middleware fix
2018-05-25 08:31:37 -07:00
Tomas Knappek
8aae1502f0 Admin server for paths which are not part of API (#1011)
* admin server added

* test fixed, ping moved out of admin server

* keeping admin/web port in sync
2018-05-21 10:41:27 -07:00
Gerardo Viedma
ea1f94253f Implement graceful shutdown of agent.DataAccess (#1008)
* Implements graceful shutdown of agent.DataAccess and underlying Datastore/Logstore/MessageQueue

* adds tests for closing agent.DataAccess and Datastore
2018-05-21 11:28:21 +01:00
Tomas Knappek
ccde0d2357 Wrap custom datastore with metrics and validator (#1002)
* Wrap method added to datastore

* datastore formatting fixed
2018-05-17 13:21:36 -07:00
Tomas Knappek
f6d47fd0ed add DELETE to allowed cors methods (#1001) 2018-05-16 14:25:02 -07:00
Reed Allman
cbe0d5e9ac add user syslog writers to app (#970)
* add user syslog writers to app

users may specify a syslog url[s] on apps now and all functions under that app
will spew their logs out to it. the docs have more information around details
there, please review those (swagger and operating/logging.md), tried to
implement to spec in some parts and improve others, open to feedback on
format though, lots of liberty there.

design decision wise, I am looking to the future and ignoring cold containers.
the overhead of the connections there will not be worth it, so this feature
only works for hot functions, since we're killing cold anyway (even if a user
can just straight up exit a hot container).

syslog connections will be opened against a container when it starts up, and
then the call id that is logged gets swapped out for each call that goes
through the container, this cuts down on the cost of opening/closing
connections significantly. there are buffers to accumulate logs until we get a
`\n` to actually write a syslog line, and a buffer to save some bytes when
we're writing the syslog formatting as well. underneath writers re-use the
line writer in certain scenarios (swapper). we could likely improve the ease
of setting this up, but opening the syslog conns against a container seems
worth it, and is a different path than the other func loggers that we create
when we make a call object. the Close() stuff is a little tricky, not sure how
to make it easier and have the ^ benefits, open to idears.

this does add another vector of 'limits' to consider for more strict service
operators. one being how many syslog urls can a user add to an app (infinite,
atm) and the other being on the order of number of containers per host we
could run out of connections in certain scenarios. there may be some utility
in having multiple syslog sinks to send to, it could help with debugging at
times to send to another destination or if a user is a client w/ someone and
both want the function logs, e.g. (have used this for that in the past,
specifically).

this also doesn't work behind a proxy, which is something i'm open to fixing,
but afaict will require a 3rd party dependency (we can pretty much steal what
docker does). this is mostly of utility for those of us that work behind a
proxy all the time, not really for end users.

there are some unit tests. integration tests for this don't sound very fun to
maintain. I did test against papertrail with each protocol and it works (and
even times out if you're behind a proxy!).

closes #337

* add trace to syslog dial
2018-05-15 11:00:26 -07:00
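The "accumulate logs until we get a `\n`" behavior described in cbe0d5e9ac can be sketched as an `io.Writer` that buffers partial writes and emits complete lines. The `lineWriter` type and its `emit` callback are hypothetical stand-ins; the real writers also add syslog formatting and swap call IDs.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineWriter buffers writes and calls emit once per complete line.
type lineWriter struct {
	buf  bytes.Buffer
	emit func(line string) // hypothetical sink, e.g. a syslog connection
}

// Write appends p to the buffer and flushes every newline-terminated line.
func (lw *lineWriter) Write(p []byte) (int, error) {
	lw.buf.Write(p)
	for {
		i := bytes.IndexByte(lw.buf.Bytes(), '\n')
		if i < 0 {
			break // keep the partial line until more data arrives
		}
		line := string(lw.buf.Next(i + 1))
		lw.emit(line[:len(line)-1]) // strip the trailing newline
	}
	return len(p), nil
}

func main() {
	lw := &lineWriter{emit: func(l string) { fmt.Printf("line: %q\n", l) }}
	fmt.Fprint(lw, "hel")
	fmt.Fprint(lw, "lo\nwor")
	fmt.Fprint(lw, "ld\n")
}
```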
Tomas Knappek
19f09b3a6c Added FN_API_CORS_HEADERS for configuring CORS headers (#997) 2018-05-15 18:03:01 +01:00
jan grant
91e58afa55 The opencensus API changes between 0.6.0 and 0.9.0 (#980)
We get some useful features in later versions; update so as to not
pin downstream consumers (extensions) to an older version.
2018-05-09 14:55:00 +01:00
Denis Makogon
7ee47f13bb Expose Agent (#892)
With server.Agent, developers can access a more transport-agnostic API to call functions
2018-05-07 11:10:23 -07:00
Reed Allman
9d721f8327 remove flaky tests (#972)
if we want them back, we can dig them out of git instead of some poor soul
uncommenting them 10 years from now and spending 3 months on failing CI builds
trying to figure out how a test that breaks doesn't mean the code's broke.

these tests are notoriously flaky and hard to understand/fix, they also test
very specific agent behaviors all the way through the front end when it may be
easier to test them in unit tests instead (should we so choose). at least,
since the behaviors tested aren't being changed very often, these are only
serving to provide negative value in time wasted re-running the test suite
[since them failing doesn't really indicate the code being wrong].

the `IOPipes` test is partially covered by `TestPipesAreClear` which hasn't
cropped up as being as flaky, but it tests fewer behaviors. it is not easy to
understand, either. while i think we learned a lot from these tests, they
haven't been a great citizen of our test suite at large, i figure if we need
to change runner behavior in the future we can maybe make another go at it.
2018-05-04 10:30:49 -07:00
Srinidhi Chokkadi Puranik
e0b82519aa Last middleware should use the request passed by preceding middleware. (#965)
This is useful when preceding middleware reads httpRequest.Body to
perform some logic, and assigns a new ReadCloser to httpRequest.Body
(as body can be read only once).
2018-04-30 13:13:24 -07:00
Tolga Ceylan
54ba49be65 fn: non-blocking resource tracker and notification (#841)
* fn: non-blocking resource tracker and notification

For some types of errors, we might want to notify
the actual caller if the error is tied 1-1
to that request. If hotLauncher is triggered with
a signaller, we also send an error notification
channel for back communication. This is passed to
checkLaunch to send synchronous responses back
to the caller that initiated this hot container
launch.

This is useful if we want to run the agent in
quick fail mode, where instead of waiting for
CPU/Mem to become available, we prefer to fail
quick in order not to hold up the caller.
To support this, non-blocking resource tracker
option/functions are now available.

* fn: test env var rename tweak

* fn: fixup merge

* fn: rebase test fix

* fn: merge fixup

* fn: test tweak down to 70MB for 128MB total

* fn: refactor token creation and use broadcast regardless

* fn: nb description

* fn: bugfix
2018-04-24 21:59:33 -07:00
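The quick-fail idea in 54ba49be65 can be sketched with a non-blocking acquire built on `select` with a `default` case. The `tracker` type is a hypothetical stand-in that counts abstract tokens; the real resource tracker accounts for CPU and memory.

```go
package main

import "fmt"

// tracker hands out a fixed number of resource tokens.
type tracker struct {
	tokens chan struct{}
}

// newTracker creates a tracker with capacity for n concurrent holders.
func newTracker(n int) *tracker {
	return &tracker{tokens: make(chan struct{}, n)}
}

// tryAcquire takes a token if one is free and returns false immediately
// otherwise, instead of waiting for resources to become available.
func (t *tracker) tryAcquire() bool {
	select {
	case t.tokens <- struct{}{}:
		return true
	default:
		return false
	}
}

// release returns a previously acquired token.
func (t *tracker) release() { <-t.tokens }

func main() {
	tr := newTracker(1)
	fmt.Println(tr.tryAcquire()) // capacity available
	fmt.Println(tr.tryAcquire()) // full: fail fast rather than block
	tr.release()
	fmt.Println(tr.tryAcquire()) // capacity available again
}
```

A caller in quick-fail mode could turn a `false` result into an immediate error response instead of holding the request open.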
Tolga Ceylan
00bb4d1257 fn: empty body tests for cold and hot (json/http) (#941) 2018-04-13 10:35:57 -07:00
Tolga Ceylan
e47d55056a fn: reduce lbagent and agent dependency (#938)
* fn: reduce lbagent and agent dependency

lbagent and agent code is too dependent. This causes
any change in agent to break lbagent. In reality, for
LB there should be no delegated agent. Splitting these
two will cause some code duplication, but it reduces
dependency and complexity (eg. agent without docker)

* fn: post rebase fixup

* fn: runner/runnercall should use lbDeadline

* fn: fixup lb agent test

* fn: remove agent create option for common.WaitGroup
2018-04-12 15:51:58 -07:00