Log of the #duraspace-ff channel on chat.freenode.net

Using timezone: Eastern Standard Time
* github-ff joins03:52
[fcrepo4] fasseg force-pushed fcrepo-4.0.0-scape from c88c63b to 9d8a5ad: http://git.io/iSQnug
* github-ff leaves
* travis-ci joins04:11
[travis-ci] futures/fcrepo4#1289 (fcrepo-4.0.0-scape - 9d8a5ad : Chris Beer): The build passed.
[travis-ci] Change view : https://github.com/futures/fcrepo4/compare/c88c63b59e4e...9d8a5ad9a920
[travis-ci] Build details : http://travis-ci.org/futures/fcrepo4/builds/14590748
* travis-ci leaves
<bljenkins>Project fcrepo-fixity-corrupter build #503: SUCCESS in 1 min 38 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-fixity-corrupter/503/ 04:19
Project fcrepo-kitchen-sink build #680: STILL UNSTABLE in 3 min 55 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-kitchen-sink/680/ 04:23
* fasseg joins05:44
* fasseg leaves05:45
* fasseg joins
<escowles>barmintor|trains: i'm ingesting right now using your reduceLookups branch -- abt. the same as master so far, but noticeably less variability08:27
* mikeAtUVa joins08:55
* osmandin joins09:10
<pivotal-bot>Osman Din started "Introspect the bson output for modeshape schematics and report back how friendly the output is" https://www.pivotaltracker.com/story/show/49012799 09:29
<mikeAtUVa>I could test this, but I'd imagine everyone already knows the answer and could just tell me: Can updates to the node type definitions (by posting a new CND file) replace or modify existing types? If so, what happens when existing content is no longer valid according to the current node type definitions?
* ermadmix joins09:42
* kaarefc joins09:43
* tecoripa joins09:47
* gregjansen joins09:52
<pivotal-bot>Frank Asseg added comment: "Projecting over 300GB worked nicely although the first request took ages when the data was first read by the..." https://www.pivotaltracker.com/story/show/61087740 09:54
Frank Asseg finished "Federate over large files" https://www.pivotaltracker.com/story/show/61087740
Scott Prater added "Create NodeType page in Adminstration manual on wiki" https://www.pivotaltracker.com/story/show/61554758 10:00
* jcoyne joins10:01
* kaarefc leaves10:06
<pivotal-bot>Andrew Woods accepted "put namespaces into serialized streamed RDF" https://www.pivotaltracker.com/story/show/61501904
* fasseg leaves10:08
* mikeAtUVa leaves
<pivotal-bot>Andrew Woods added comment: "Could you document your setup, configuration, and results on the wiki?10:09
https://wiki.duraspace.org/display/F..." https://www.pivotaltracker.com/story/show/61087740
Andrew Woods rejected "Federate over large files" https://www.pivotaltracker.com/story/show/61087740
* fasseg joins
* fasseg leaves
* fasseg joins
* mikeAtUVa joins10:10
* ermadmix leaves10:16
* ermadmix joins10:20
* github-ff joins10:43
[fcrepo4] mikedurbin opened pull request #174: Implemented auto versioning policy, made explicit versioning the default... (master...versioning-6) http://git.io/Sp0ICw 10:44
* github-ff leaves
<pivotal-bot>Mike Durbin started "Add alternate versioning policy." https://www.pivotaltracker.com/story/show/61421690
Mike Durbin added comment: "https://github.com/futures/fcrepo4/pull/174" https://www.pivotaltracker.com/story/show/61421690
Mike Durbin finished "Add alternate versioning policy." https://www.pivotaltracker.com/story/show/61421690
Mike Durbin deleted "Turn off versioning by default." https://www.pivotaltracker.com/story/show/61486894 10:45
* jcoyne leaves
<pivotal-bot>Scott Prater added "Fedora3/Fedora 4Performance test: 10,000 objs at 1MB apiece" https://www.pivotaltracker.com/story/show/61558180 10:51
Scott Prater edited "Fedora3/Fedora 4 Performance test: 10,000 objs at 1MB apiece" https://www.pivotaltracker.com/story/show/61558180
<cbeer>mikeAtUVa: " If so, what happens when existing content is no longer valid according to the current node type definitions?"11:01
mikeAtUVa: Modeshape will complain if you try to change something that breaks existing nodes
<barmintor>are we stand-upping?11:05
<escowles>we are upstanding
<cbeer>gregjansen: can you figure out a better solution to your standup audio? it's been driving me crazy this week for some reason11:27
* ermadmix leaves11:29
<barmintor>tecoripa: another possibility for the discrepancy between your tests and benchtool might be disk speed
since I think benchtool just sends random data
INFO [main] (BenchToolFC4.java:145) - Processing 1000 objects took 474899 ms11:32
INFO [main] (BenchToolFC4.java:150) - Throughput was 004.31 mb/s
(with the YK agent attached, but not recording snapshot data)
<tecoripa>barmintor: I thought about that; but it's a pretty radical difference: just cat zeros to disk I get about 214MB/s, but fcrepo4 was averaging about 0.48MB/s, or something outrageous like that.
* awoods joins11:33
<cbeer>ajs6f-- # shoulda made him write more tests
<awoods>escowles: This is the ticket I had in mind... it may or may not be related to your page-refresh issue: https://www.pivotaltracker.com/story/show/49934445
<pivotal-bot>bug: Fix last modified calculations to include child node modification data (and sometimes jcr:content under that) (unscheduled) / owner:
<tecoripa>barmintor: I'm wondering more about CPU utilization... my load average crept up to 12 while running the two-thread test, and the machine was pretty sluggish (there was no way I could have participated in the Google Hangout standup while the test was running) 11:34
<awoods>cbeer: you mentioned you were going to follow with some ITs. But it is best to have those in the initial commit.
<tecoripa>which makes me think it was maybe the fault of benchtool and /dev/urandom?
<cbeer>awoods: nah, the ITs I was going to add were HTML tests, not about the change he was making11:35
my problem is the whole NamespaceRdfContext doesn't actually do what I want it to.
it adds all those VANN triples in addition to setting the namespace prefixes11:36
<awoods>cbeer: Do you think the NamespaceRdfContext generally includes more triples than it should? or that you have a specific need in a specific scenario to only have certain triples?11:39
<cbeer>awoods: no. i think NamespaceRdfContext does exactly what it should. it just isn't appropriate for doing what we're doing.
right solution to the wrong problem.
<awoods>cbeer: Maybe it was not clear to ajs6f what you were looking for.11:40
<barmintor>awoods: did my comment re: node paths make sense?11:56
<awoods>barmintor: yes, just responding... all looks good.11:57
<barmintor>ok, happy to make any/all of those changes
escowles: were you saying that branch ended up being more-or-less identical throughput? b/c that suggests that this whole thing might be bound by externalities11:58
<awoods>barmintor: comments complete.12:01
<barmintor>awoods: ok, I'm going to make amendments and squash them into a new PR12:02
<awoods>barmintor: great
escowles? ^^12:07
* ermadmix joins12:11
<gregjansen>tecoripa: are you using the kitchen sink build possibly for that 9hr test?12:20
<gregjansen>has anyone else seen a missing jms.xml file when starting fcrepo-webapp with mvn jetty:run? (FileNotFoundException: class path resource [spring/jms.xml])12:21
tecoripa: the results I've posted are based on fcrepo-webapp, not kitchen sink. I wonder if all the extras are slowing it down.. 12:22
<awoods>gregjansen/tecoripa: Interesting, is kitchen-sink the test app people are using?!
<tecoripa>gregjansen: I seem to recall that. Have you done a full mvn clean install on the kitchen sink?
wait... I just run fcrepo-webapp... within the fcrepo4 full checkout.
<gregjansen>tecoripa: I am not running the kitchen sink, but I did do a clean build of fcrepo4
<awoods>gregjansen/tecoripa: By definition, kitchen-sink is probably not the ideal test app.
<tecoripa>Is that the kitchen sink?
<awoods>tecoripa: no, it sounds like you are fine. kitchen-sink is another github project.12:23
<gregjansen>tecoripa: nope, good well at least we are on the same page. you mentioned KS yesterday at one point and I thought that could be a reason for your slow down
<awoods>gregjansen: re:jms.xml, no error on my side.12:24
<tecoripa>ah, okay. I learned something new today. I thought kitchen sink was the term for fcrepo4.
no, I just run fcrepo-webapp within fcrepo4
to be explicit: here's my command line:
fcrepo-webapp$ MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=1g" mvn jetty:run -Dfcrepo.modeshape.configuration=classpath:config/repository-basic.json
<gregjansen>awoods: jms.xml is being imported in atom_jms.xml from classpath:/spring/jms.xml, but it is not in the local project12:25
awoods: all within fcrepo-webapp12:26
tecoripa: I have not used repository-basic before, just minimal and clustered so far.12:27
<barmintor>gregjansen: You continue to be the default @author in Eclipse. I love it. Everything is your problem.
<gregjansen>barmintor: did I do that? am I hardcoded in templates?12:28
<awoods>gregjansen: give me a link to a line-number in github where jms.xml is being imported.12:29
<gregjansen>awoods: will do
<tecoripa>gregjansen: check out https://wiki.duraspace.org/display/FF/Test+-+Repository+Profile%3A+Basic. You'll see the repository-basic.json file I use there.
<gregjansen>awoods: nm, stale code. I have too many working dirs12:32
<pivotal-bot>Chris Beer started "Provide the already-registered namespaces in the fcr:sparql HTML form " https://www.pivotaltracker.com/story/show/61500850 12:38
<tecoripa>gregjansen, others: I was just talking with a coworker about the performance testing12:45
and we are wondering if maybe benchtool is introducing some latency, maybe enough to skew results
especially for the authz/no authz test, all on fcrepo4
if benchtool takes longer to generate the random data and send it off than it takes the server to process it, in any number of scenarios, then there's no real way to tell what slows down or speeds up the server 12:46
<pivotal-bot>Chris Beer estimated "Provide the already-registered namespaces in the fcr:sparql HTML form " as 1 point https://www.pivotaltracker.com/story/show/61500850
* kaarefc joins12:47
<tecoripa>I'm thinking it may be time to do another iteration on benchtool: have it pre-generate a file with random data, then just change the first X bytes (10, 32, whatever) with every POST
that should make it much faster
* github-ff joins
[fcrepo4] cbeer created namespace-prefixes-when-rendering-rdf (+2 new commits): http://git.io/USIGJQ
fcrepo4/namespace-prefixes-when-rendering-rdf d4be47f Chris Beer: remove unnecessarily verbose logging from fcrepo-http-api
fcrepo4/namespace-prefixes-when-rendering-rdf 6c819b2 Chris Beer: make sure namespace prefixes propagate when rendering rdf streams, and ensure they show up in SPARQL-Update and SPARQL-Select forms...
* github-ff leaves
<awoods>tecoripa: Agreed, is there a reason we have not yet pulled the data-generation out of the timing phase of benchtool?12:48
<tecoripa>this morning the slowness I was seeing seemed to be CPU-bound: load average was around 12
<gregjansen>tecoripa: that's a good idea.. In theory we should only have to change some bytes for every chunk size span
<pivotal-bot>Chris Beer added comment: "https://github.com/futures/fcrepo4/pull/175" https://www.pivotaltracker.com/story/show/61500850
Chris Beer finished "Provide the already-registered namespaces in the fcr:sparql HTML form " https://www.pivotaltracker.com/story/show/61500850
* github-ff joins
[fcrepo4] cbeer opened pull request #175: Namespace prefixes when rendering rdf (master...namespace-prefixes-when-rendering-rdf) http://git.io/1ObINw
* github-ff leaves
<tecoripa>which makes me wonder if benchtool was eating up all my CPU, making random bits
<gregjansen>awoods: I thought that data gen time was excluded from the elapsed ingest time.. but that doesn't mean we can really ignore the impact 12:49
<barmintor>we should be able to use the same bytes, and simply prepend the counter value
<tecoripa>I think it hasn't been done yet, simply because we haven't yet reached the point (until now) where we've had enough information to take it into account12:50
<cbeer>i thought that's what we were doing. maybe that was in jmeter-madness?12:51
* osmandin leaves12:55
* osmandin joins
<awoods>Is someone taking the action to:
- see what benchtool is currently doing
- change the code (if necessary) to pull data-generation out of timing loop, or
- change the code (if necessary) to create new data by adding a byte or two?
<tecoripa>I'm looking at the benchtool code right now
<pivotal-bot>Gregory Jansen added comment: "I don't see a port conflict, I see a LOCK collision at /tmp/fcrepo4-data/fcrepo.ispn.repo.CacheDirPath/d..." https://www.pivotaltracker.com/story/show/61225722 12:57
<tecoripa>yes, it looks like new random data is generated at run time for every datastream POST12:58
<awoods>tecoripa: And is it generated from scratch, or by adding a byte or two?
<tecoripa>by scratch, it looks like
<barmintor>awoods: javadoc changes are in, and IllegalArgException added to that base class13:00
PR is updated with squashed commit
<awoods>thanks, barmintor.
<pivotal-bot>Andrew Woods edited "Reduce calls to #getMixinNodeTypes in FedoraTypesUtils" https://www.pivotaltracker.com/story/show/61411396
<tecoripa>I'll create a ticket: update benchtool to read static random data from file, only change a few bytes on each POST13:01
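A minimal sketch of the idea in that ticket, assuming the base block is produced once before any timing starts and only the first 8 bytes change per object. The class name `PregeneratedPayload` and its methods are hypothetical and are not part of benchtool; a fixed seed is used here so that every tester would get the same base data, in place of a shared file on disk.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Random;

/** Hypothetical helper, not benchtool code: one random block, reused for every object. */
final class PregeneratedPayload {

    private final byte[] base;

    /** Generates the random base block once; call this outside any timed section. */
    PregeneratedPayload(int size, long seed) {
        if (size < Long.BYTES) {
            throw new IllegalArgumentException("size must be at least " + Long.BYTES + " bytes");
        }
        base = new byte[size];
        new Random(seed).nextBytes(base);
    }

    /** Copies the base block and overwrites its first 8 bytes with the object counter. */
    byte[] forObject(long counter) {
        byte[] copy = Arrays.copyOf(base, base.length);
        ByteBuffer.wrap(copy).putLong(counter);
        return copy;
    }
}
```

Each call costs one array copy instead of a fresh round of random number generation, and every payload still differs from the others in its leading bytes.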
* github-ff joins13:03
[fcrepo4] awoods pushed 2 new commits to master: http://git.io/jPzx-w
fcrepo4/master 2954c46 Benjamin Armintor: reduce node lookups in event processing
fcrepo4/master b6c6c35 Andrew Woods: Merge pull request #173 from barmintor/reduceLookups...
* github-ff leaves
<pivotal-bot>Andrew Woods delivered "Reduce calls to #getMixinNodeTypes in FedoraTypesUtils" https://www.pivotaltracker.com/story/show/61411396
<tecoripa>also: random number generation IS included in the timing13:04
https://github.com/futures/benchtool/blob/master/src/main/java/org/fcrepo/bench/BenchToolFC4.java#L129 and https://github.com/futures/benchtool/blob/master/src/main/java/org/fcrepo/bench/BenchToolFC4.java#L153 13:05
<bljenkins>Project fcrepo4 build #1458: UNSTABLE in 17 min: http://ci.fcrepo.org/jenkins/job/fcrepo4/1458/
<tecoripa>so I'll add that to the ticket, too
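One way to keep object preparation out of the measurement, sketched under the assumption that the send step can be passed in as a callback. `IngestTimer`, `timeSends`, and the `sender` parameter are hypothetical names; `sender` stands in for the HTTP POST and nothing here is taken from the benchtool source.

```java
import java.util.function.Consumer;
import java.util.function.LongFunction;

/** Hypothetical harness, not benchtool code: only the send step is timed. */
final class IngestTimer {

    /** Returns the total nanoseconds spent inside sender.accept, excluding payload preparation. */
    static long timeSends(int objects, LongFunction<byte[]> payloads, Consumer<byte[]> sender) {
        long elapsed = 0L;
        for (int i = 0; i < objects; i++) {
            byte[] body = payloads.apply(i);      // preparation: outside the timed section
            long start = System.nanoTime();
            sender.accept(body);                  // the step being measured
            elapsed += System.nanoTime() - start;
        }
        return elapsed;
    }
}
```

Accumulating per-send intervals, instead of taking one start and one end timestamp around the whole loop, is what keeps the generation cost out of the reported figure.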
<awoods>tecoripa: I would be interested in what others would like/expect, but having the benchtool generate the base data may be easier/more-consistent.13:06
<tecoripa>awoods: I think we'll eventually want to do both.13:07
<awoods>tecoripa: versus everyone using their own local file for seeding the tests.
tecoripa: I am thinking about apples/apples test comparison.
<tecoripa>I was talking with Esme about it a few days ago: for more complex, real-world type tests, we may want to create a template object tree on disk, then have benchtool read it in and post it X number of times.
* travis-ci joins13:08
[travis-ci] futures/fcrepo4#1291 (namespace-prefixes-when-rendering-rdf - 6c819b2 : Chris Beer): The build has errored.
[travis-ci] Change view : https://github.com/futures/fcrepo4/compare/d4be47fe8772^...6c819b2a03db
[travis-ci] Build details : http://travis-ci.org/futures/fcrepo4/builds/14613982
* travis-ci leaves
<tecoripa>but that can be a second pass. I agree: initially, just a standard pre-generation is the best
<pivotal-bot>Scott Prater added "Improve benchtool performance: use pre-generated data, don't include object preparation in timings" https://www.pivotaltracker.com/story/show/61568578 13:11
<tecoripa>awoods: my plate of tickets is beginning to resemble a Thanksgiving meal... especially as what starts off as simple tasks quickly increase in complexity and scope13:12
<awoods>tecoripa: scope it down like a good pilgrim.13:13
<tecoripa>awoods: so if anyone is bored and looking for something to do, I can cheerfully volunteer some of my tickets, while I focus on the performance testing and content modelling.
awoods: yeah, look what the pilgrims started. Talk about scope creep.13:14
<awoods>tecoripa: maybe that analogy took a wrong turn.
* travis-ci joins13:16
[travis-ci] futures/fcrepo4#1295 (master - b6c6c35 : Andrew Woods): The build passed.
[travis-ci] Change view : https://github.com/futures/fcrepo4/compare/9d8a5ad9a920...b6c6c35eb747
[travis-ci] Build details : http://travis-ci.org/futures/fcrepo4/builds/14614932
* travis-ci leaves
<pivotal-bot>Scott Prater added comment: "@gregoryjansen: nope, it's still in /tmp. as I discovered the other day. A quick workaround that I use fo..." https://www.pivotaltracker.com/story/show/61225722
<tecoripa>okay, 1000 objects, benchtool on a different host, much better performance:13:18
INFO [main] (BenchToolFC4.java:114) - ingesting 1000 objects with datastream size 1048576
INFO [main] (BenchToolFC4.java:127) - Initial cluster size is 1
100.00% - ingest finished
INFO [main] (BenchToolFC4.java:155) - Processing 1000 objects took 3316082 ms
INFO [main] (BenchToolFC4.java:160) - Throughput was 000.31 mb/s
still pretty miserable, though: 000.31 mb/s
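For reference, the throughput can be recomputed from the object count, datastream size, and elapsed time in the log lines above. This is a sketch of the arithmetic only; the `Throughput` class is hypothetical, and how benchtool itself derives and rounds its figure is not checked here.

```java
/** Hypothetical helper for recomputing ingest throughput from logged values. */
final class Throughput {

    /** Total payload bytes divided by elapsed seconds. */
    static double bytesPerSecond(long objects, long bytesPerObject, long elapsedMillis) {
        return (objects * (double) bytesPerObject) / (elapsedMillis / 1000.0);
    }

    /** The same rate expressed in MiB/s (1 MiB = 1024 * 1024 bytes). */
    static double mebibytesPerSecond(long objects, long bytesPerObject, long elapsedMillis) {
        return bytesPerSecond(objects, bytesPerObject, elapsedMillis) / (1024.0 * 1024.0);
    }
}
```

With 1000 objects of 1048576 bytes and 3316082 ms, this gives about 0.30 MiB/s, or roughly 0.32 in decimal MB/s, which is in the same range as the logged figure. The run took a little over 55 minutes.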
<awoods>I wonder why escowles did not run ingest with 1000 objects: https://wiki.duraspace.org/display/FF/Test+Results+Summary 13:21
<escowles>awoods: i was particularly interested in change of ingest rate as the repository size grew, so decided to go with more granular batch size 13:22
but it is incongruent -- maybe i should reingest with 1000 object batches for a clearer summary?
* kaarefc leaves
<awoods>escowles: It would be interesting to see if you get the same "miserable" results that tecoripa is seeing.13:23
<escowles>maybe there's something about the batch size that's hurting performance? interesting idea -- maybe fcrepo4 just needs to take a break now and then...13:24
<cbeer>escowles: i did not see that behavior in my testing with a real-world data set.
<awoods>escowles: did you see the question from barmintor from about 1.5 hours ago?
<tecoripa>escowles: here's my profile: https://wiki.duraspace.org/display/FF/AuthZ+-+No+AuthZ+Fedora+4+Comparison+Performance+Testing 13:25
<escowles>awoods: nope, scrolling back...
<bljenkins>Yippie, build fixed!
Project fcrepo4 build #1459: FIXED in 20 min: http://ci.fcrepo.org/jenkins/job/fcrepo4/1459/
armintor: reduce node lookups in event processing
<escowles>awoods/barmintor: yes, i'm seeing broadly similar performance for the reducedLookups branch13:26
<barmintor>escowles: Lame.
<escowles>and i agree that externalities like network bandwidth and disk I/O are probably going to be the biggest limiting factors
but there are some cases where fcrepo3 does much better, so i think it's not just net/disk I/O13:27
<barmintor>escowles: what are those cases?
<bljenkins>Project fcrepo-fixity-corrupter build #504: SUCCESS in 1 min 27 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-fixity-corrupter/504/
<escowles>barmintor: reads and deletes13:28
<barmintor>escowles: ok, haven't really profiled those ops yet
<tecoripa>cbeer: you have a few minutes to talk about node types and validations?13:29
<escowles>read performance is about the same, but with fcrepo3 having a small advantage
<tecoripa>cbeer: here's the approach I'm taking:
cbeer: 1) create a CND file, POST it to register your new type.
<pivotal-bot>Gregory Jansen added comment: "Thanks Scott! I solved it by passing system properties to the jetty container, which in this case happens..." https://www.pivotaltracker.com/story/show/61225722
Gregory Jansen added comment: "somewhat related, I am thinking of having cargo start the webapp for cluster testing as well. Still ponde..." https://www.pivotaltracker.com/story/show/61225722 13:30
<escowles>deletes are very fast in any case (~11ms per obj in fcrepo3 and ~40ms in fcrepo4)
<pivotal-bot>Gregory Jansen added comment: "@awoods PR updated" https://www.pivotaltracker.com/story/show/61225722
<gregjansen>bbi 1hr13:31
<pivotal-bot>Andrew Woods added comment: "@md, can you also please add to the documentation how to configure implicit versioning? What effect does th..." https://www.pivotaltracker.com/story/show/61421690 13:32
<bljenkins>Project fcrepo-kitchen-sink build #681: STILL UNSTABLE in 5 min 15 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-kitchen-sink/681/
<cbeer>tecoripa: with you so far (although I think we'd be better not conflating CNDs and validation too much)
<tecoripa>cbeer: 2) Create an object, add the new custom type to it. is there an example of doing that anywhere on the wiki?13:33
<barmintor>escowles: one thing I'm seeing is that the FileCacheStore does a ton of refresh/purging. Each one is small, but, for example, it was invoked 100k times during the run of 1k objects 13:34
<tecoripa>cbeer: yeah, that gets to the nub of my questions: when would/should/could validation happen?
<barmintor>escowles: that is probably running in the bg of reads as well
<bljenkins>Yippie, build fixed!
Project fcrepo-jms-indexer-pluggable build #294: FIXED in 8 min 48 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-jms-indexer-pluggable/294/
<tecoripa>cbeer: right now, for example, Fedora 3 content models are just descriptive, not prescriptive, unless you use Asger's enhanced content models
<escowles>barmintor: so maybe it's doing that after writing each block of data?13:35
<pivotal-bot>Andrew Woods added comment: "To be clear, from the description in this ticket, "Implement a version policy that only creates versions wh..." https://www.pivotaltracker.com/story/show/61421690
<tecoripa>cbeer: so I suppose someone could just add the new node type to an object, and stop there.
but supposing they wanted to validate the object...13:36
<barmintor>escowles: not sure yet, but it's apparently ~16% of the time on CPU now
<tecoripa>cbeer: how would that happen?
<barmintor>escowles: biggest single resource usage left by order of magnitude
<tecoripa>cbeer: does it depend on the properties set in the CND?
<pivotal-bot>Andrew Woods rejected "Add alternate versioning policy." https://www.pivotaltracker.com/story/show/61421690
Andrew Woods accepted "Reduce calls to #getMixinNodeTypes in FedoraTypesUtils" https://www.pivotaltracker.com/story/show/61411396
<tecoripa>cbeer: and if so, would the proper workflow be to build up your object FIRST, then add the CND with the constraints, and fcrepo4 will automatically validate it?13:37
cbeer: handing the mike over to you
* ermadmix leaves13:38
<cbeer>tecoripa: thinking. mostly about why CNDs seem like a poor solution to this problem. i suppose because, at least in a hydra context, we spend a lot of time building up object programmatically13:39
<escowles>barmintor: i wonder if there is some buffer size we can tune -- our typical files might be larger than the typical JCR content size
<tecoripa>cbeer: yeah, and we're moving that way, too, with the REST API bundled in transactions, rather than the old-fashioned FOXML ingest (which was easier to validate up front)13:40
<cbeer>tecoripa: i think ajs6f has some interesting ideas about post-ingest "validation" with OWL, but that seems like an on-demand feature and probably brings with it baggage (or, at the very least, the same limitations you get from OWL validation in the LD world.) 13:43
<tecoripa>cbeer: yes, on-demand was what I was wondering about: I remember hearing some noise from ajs6f about this last week. I imagine that would be a TBD REST endpoint?13:44
cbeer: so is there "automatic" validation that goes on right now? does fcrepo check and make sure your new object conforms to the types associated with it? does it complain if you add a new Type, and suddenly the object is invalid? 13:45
cbeer: (or maybe these are all questions I should answer with experimentation and empirical observation... ?)13:46
<cbeer>tecoripa: modeshape enforces the CND. if the node type says it has a mandatory property, it'll refuse to save the session without it.
which is a much harsher form of validation and should be wielded with care.
<tecoripa>cbeer: okay, good to know. So adding a new nodetype should be the LAST step in a transactional workflow.
<cbeer>except the node types are also what allows it to have certain properties in the first place.13:47
<tecoripa>cbeer: arggh. I repeat: arrgh.
<escowles>how does that work with a required property that's defined by a CND?
<cbeer>so that's a little weird. i guess you could introduce an object as a e.g. fedora:resource that allows any type of property, and then restrict it at the end of the workflow
<tecoripa>cbeer: how does one avoid that chicken-and-egg problem? avoid mandatory properties?13:48
<cbeer>tecoripa: again, that's why i doubt CNDs are the right solution to validation.. but i guess we need to define what we mean by validation in the first place.
<awoods>barmintor/escowles: Do you think these default values come into play from the repository.json: https://github.com/ModeShape/modeshape/blob/modeshape-3.5.0.Final/modeshape-jcr/src/main/resources/org/modeshape/jcr/repository-config-schema.json#L137
<tecoripa>cbeer: right okay. so you have a permissive nodeType for use when working, then a restricted one for the final product? that seems suboptimal.
<barmintor>awoods: this actually appears to be a scheduled task13:49
<tecoripa>cbeer: it sounds like we have a problem that we get validation whether we want it or not. Unless we avoid mandatory properties.
<escowles>awoods: i'm not sure -- all of our data files are larger than 4KB, so i don't know if that comes into play13:50
<barmintor>awoods/escowles: I think the timing is inferred from the eviction policy
<cbeer>tecoripa: well, avoid mandatory properties that aren't /really/ mandatory.
<tecoripa>cbeer: and also avoid the ones that are. At least, until you have them in the object.13:51
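The two rules described in this exchange can be illustrated with a toy model: a node type decides which properties may be set at all, and a mandatory property that is missing makes the save fail. None of the classes below are ModeShape or JCR APIs; `ToyNode`, `ToyType`, and the `"*"` wildcard are hypothetical and exist only to make the ordering problem concrete.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Toy model only; these are not ModeShape or JCR classes. */
final class ToyNode {

    /** A node type: which properties may be set, and which must be present at save time. */
    static final class ToyType {
        final Set<String> allowed;    // "*" in this set means any property name is accepted
        final Set<String> mandatory;

        ToyType(Set<String> allowed, Set<String> mandatory) {
            this.allowed = allowed;
            this.mandatory = mandatory;
        }
    }

    private final Map<String, String> properties = new HashMap<>();
    private ToyType type;

    ToyNode(ToyType type) {
        this.type = type;
    }

    /** Swaps the node's type, e.g. from a permissive one to a restrictive one. */
    void setType(ToyType type) {
        this.type = type;
    }

    /** Rejects any property that the current type does not allow. */
    void setProperty(String name, String value) {
        if (!type.allowed.contains("*") && !type.allowed.contains(name)) {
            throw new IllegalStateException("property not allowed by type: " + name);
        }
        properties.put(name, value);
    }

    /** Refuses to save while any mandatory property of the current type is missing. */
    void save() {
        for (String name : type.mandatory) {
            if (!properties.containsKey(name)) {
                throw new IllegalStateException("missing mandatory property: " + name);
            }
        }
    }
}
```

In this model, a node created with a restrictive type cannot be saved until its mandatory properties are set, while a node built up under a permissive type can be switched to the restrictive type just before the final save. The toy does not re-check already-set properties against the new type, which a real repository may well do.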
<cbeer>tecoripa: also, looking at the stub page on the wiki (https://wiki.duraspace.org/display/FF/Design+-+Validation), I don't think CNDs solve any of those problems
<tecoripa>cbeer: yes, that validation seems to be more about content. CNDs are more like content models: they're all about structure.13:52
<mikeAtUVa>awoods, escowles: Should we make a decision about separating "create version" from "set version label" before I can finish this ticket?
<tecoripa>cbeer: I propose that there are two issues at work here: structure validation and content validation.
<awoods>mikeAtUVa: yes. For ALL, here is the question...13:53
Should adding a label to a node via the REST api also create a version of that node (if versioning is turned on, of course)?13:54
<tecoripa>cbeer: I need to take off and do some meal shopping. I'll be back online later, this evening.
<escowles>mikeAtUVa: my vote is: adding labels shouldn't create a new version -- those two operations should be separate13:55
* gregjansen leaves
* ermadmix joins
<cbeer>awoods: if versioning is on, is there already a version of the current state of the node?13:56
i'm not sure the question makes sense
<mikeAtUVa>cbeer: no
<awoods>mikeAtUVa: that depends, no?
<mikeAtUVa>cbeer: that label would be applied to the last version checkpoint made
<cbeer>mikeAtUVa: which, if versioning is on, is any time you change the object, right? 13:57
<tecoripa>cbeer: I'll start off with the simple stuff: creating the CND, ingesting it as a node type, and setting an object's node type. I'll also plan on documenting the mix-in approach, multiple node types on a single object. We can address validation later, though I'll make a note about being careful about setting mandatory properties.
<mikeAtUVa>cbeer: well not anymore...
cbeer: that policy (though present in fedora 3) kind of sucks...
<tecoripa>cbeer: if I have any questions, I'll post them to ff-tech, or try to catch you on IRC.
<cbeer>mikeAtUVa: ok. so we're not talking modeshape versioning any more?
<mikeAtUVa>cbeer: there is a way (setting a property on a node) to have that sort of policy enforced, but I wanted to move to a place where versions are made only when requested13:58
<awoods>tecoripa: fedora-tech ;)
<cbeer>mikeAtUVa: got it. then, yes, isn't adding a label requesting a new version?
<mikeAtUVa>cbeer: We are... but we're talking about how and when our code invokes the modeshape code that creates a new checkpoint.
<tecoripa>awoods: got it. :)
see you all later.
<cbeer>if i say this object is tagged as version 0.0.1, i want the current state of the object tagged as 0.0.1
not whatever state it happened to be in when someone requested a version13:59
<mikeAtUVa>tecoripa: happy thanksgiving
<awoods>mikeAtUVa/cbeer/escowles: would this save time by having a 5min hangout?
<tecoripa>mikeAtUVa: thanks, you too. and same to all the other yankees online.14:00
<mikeAtUVa>awoods: probably
* tecoripa leaves
cbeer: are you available?14:01
<cbeer>awoods: no, sorry.14:02
* mikeAtUVa leaves
* mikeAtUVa joins14:04
<awoods>cbeer/mikeAtUVa/escowles: In summary, decision is to have explicit endpoint for creating a version (even if you do not have your F4 setup to auto-version) and that endpoint will add a label to the new version if a label is a part of the POST request.14:18
<pivotal-bot>Mike Durbin edited "Add alternate versioning policy." https://www.pivotaltracker.com/story/show/61421690 14:20
Andrew Woods delivered "Provide the already-registered namespaces in the fcr:sparql HTML form " https://www.pivotaltracker.com/story/show/61500850 14:26
* github-ff joins
[fcrepo4] awoods closed pull request #175: Namespace prefixes when rendering rdf (master...namespace-prefixes-when-rendering-rdf) http://git.io/1ObINw
* github-ff leaves
<cbeer>awoods: sounds good to me
<pivotal-bot>Chris Beer accepted "Provide the already-registered namespaces in the fcr:sparql HTML form " https://www.pivotaltracker.com/story/show/61500850
<awoods>cbeer: I am glad, we missed you.
cbeer: As for: https://www.pivotaltracker.com/story/show/61057608 14:27
<pivotal-bot>feature: Run fcrepo3 benchmarks on Stanford server (rejected) / owner: Chris Beer
<awoods>cbeer: At some point, it would be nice to have a summary of your tests at the top of that page... or do you feel like that is redundant?14:28
cbeer: rather, a summary of your test "results"
<cbeer>awoods: i think it's fine to have a summary, once we actually know what's interesting.
(and i don't think i'm there yet)14:29
<awoods>cbeer: I will go ahead and push that ticket through in the meantime, then.
<pivotal-bot>Andrew Woods accepted "Run fcrepo3 benchmarks on Stanford server" https://www.pivotaltracker.com/story/show/61057608
* travis-ci joins14:40
[travis-ci] futures/fcrepo4#1296 (master - 2707674 : Andrew Woods): The build passed.
[travis-ci] Change view : https://github.com/futures/fcrepo4/compare/b6c6c35eb747...2707674c86e5
[travis-ci] Build details : http://travis-ci.org/futures/fcrepo4/builds/14618754
* travis-ci leaves
<bljenkins>Project fcrepo-fixity-corrupter build #505: SUCCESS in 1 min 14 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-fixity-corrupter/505/ 14:47
Project fcrepo-kitchen-sink build #682: STILL UNSTABLE in 2 min 54 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-kitchen-sink/682/ 14:50
Project fcrepo-jms-indexer-pluggable build #295: UNSTABLE in 6 min 34 sec: http://ci.fcrepo.org/jenkins/job/fcrepo-jms-indexer-pluggable/295/ 14:53
<barmintor>if I put something in fcrepo-webapp/src/main/resources, will it get copied into WEB-INF/classes/ after the build?14:57
* gregjansen joins15:21
<barmintor>escowles: I was able to bump ingest throughput up ~18% by changing ISPN config properties15:33
<escowles>barmintor: nice!
awoods: i've finished ingesting a few 1000-object batches, and the times are about 10% higher than 10x the 100-object batch average15:40
<awoods>escowles: Do you see a gradually increasing trend per item?15:41
<escowles>no -- it was about the same for the three batches
i'll do f3 now and compare
<awoods>barmintor: I look forward to seeing your magic ISPN config.15:42
<barmintor>INFO [main] (BenchToolFC4.java:145) - Processing 1000 objects took 298900 ms15:45
INFO [main] (BenchToolFC4.java:150) - Throughput was 006.85 mb/s
^^ improved over 4.31 mb/s
but that one was with a singleton store15:46
that said, singleton store is the apples-to-apples comparison15:47
escowles: can I send you a gist of the ispn config, and see what your numbers are like?15:48
<escowles>barmintor: sure
<barmintor>escowles: https://gist.github.com/barmintor/80aa53b6ac7584ee3e3d15:49
<pivotal-bot>Osman Din added comment: "Continuing to document findings here: https://wiki.duraspace.org/display/FF/ModeShape+Artifacts+Layout16:18
But pl..." https://www.pivotaltracker.com/story/show/49012799
Osman Din started "Fix integration test AtomJMSIT exception related to truncated messages" https://www.pivotaltracker.com/story/show/6115284216:23
<barmintor>I am rapidly becoming the only person left on the floor16:26
<pivotal-bot>Andrew Woods added "Create objects from CND examples" https://www.pivotaltracker.com/story/show/6158000816:29
Andrew Woods added comment: "@gregoryjansen, The PR looks good and it even builds. Is this ready to ship?
I have created a follow-on ti..." https://www.pivotaltracker.com/story/show/61225722
Andrew Woods added comment: "Please "Finish" ticket if ready." https://www.pivotaltracker.com/story/show/6122572216:30
<awoods>barmintor: Have you been napping?16:31
<barmintor>No, I've been playing with ISPN configs
with naps in between
awoods: I want to see the profiling for that ISPN config I gist'ed earlier to figure out what the next hotspot to work on should be. It's in-progress.16:35
<awoods>barmintor: Is escowles running it?
<escowles>awoods: yep, just finished baseline, starting to run barmintor's new ispn config now16:36
<barmintor>awoods: If he's running anything, it's throughput tests
<awoods>barmintor: How long do your profiling runs go?
<barmintor>awoods: It varies. I'm hoping this finishes before 5:30, so I can upload it and look at the reports this weekend.16:37
<awoods>barmintor: Or are you just saying that you will know what to tackle next after the run completes... and in the meantime you are going to get off the floor?
<barmintor>previously, throughput w/ the profiler on was ~0.5mb/s… so it takes a while16:39
<awoods>barmintor: It sounds like you are making strides, which is great.
<barmintor>some progress. I'll try to start looking at reads next week, I guess16:40
<awoods>any news on your front, cbeer?
<cbeer>awoods: no news. back to struggling with clustering.
<awoods>cbeer: I am glad you are looking at the clustering... I feel like there are a lot of issues awaiting us on that front.16:41
<pivotal-bot>Benjamin Armintor added "Run profiler on singleton cachestore on 2954c467f" https://www.pivotaltracker.com/story/show/6158055016:42
Benjamin Armintor started "Run profiler on singleton cachestore on 2954c467f" https://www.pivotaltracker.com/story/show/61580550
<awoods>cbeer: At a minimum, it would be nice to show improved throughput (or other performance metric) with the addition of cluster servers. That is ultimately the promise we have been holding for the clustering capability.16:43
<escowles>barmintor: first 1000-obj batch done with new ispn config: 17% faster than stock17:00
<pivotal-bot>Gregory Jansen added comment: "I will make more models for UNC later, but I want to finish this so that Scott can use it." https://www.pivotaltracker.com/story/show/6122572217:01
Gregory Jansen finished "Create project with example content models" https://www.pivotaltracker.com/story/show/61225722
<barmintor>escowles: seems like a promising direction.
<pivotal-bot>Gregory Jansen finished "Run benchmark tests for single Fedora node on server hardware." https://www.pivotaltracker.com/story/show/60983816
<escowles>barmintor: most def -- i'll ingest a few thousand objs to make sure the trend holds, but i think i'll try a smaller datastream size (10MB?) to see how much that impacts things17:04
<barmintor>oh, was that the 50M payload?
<escowles>barmintor: yep, i've been using 50MB datastreams for all my tests17:08
which might have something to do with net/disk I/O being the main bottleneck -- larger datastreams probably deemphasize the overhead
anyway, gotta run -- happy thanksgiving & hanukkah everyone!17:09
<pivotal-bot>Andrew Woods delivered "Create project with example content models" https://www.pivotaltracker.com/story/show/6122572217:12
Eric James added "fix error adding a solr doc via fcrepo-jms-indexer-pluggable" https://www.pivotaltracker.com/story/show/6158222817:17
Eric James started "fix error adding a solr doc via fcrepo-jms-indexer-pluggable" https://www.pivotaltracker.com/story/show/6158222817:18
Andrew Woods accepted "Create project with example content models" https://www.pivotaltracker.com/story/show/6122572217:21
Andrew Woods added comment: "Do you have a link or some indication of the product of this ticket?" https://www.pivotaltracker.com/story/show/6098381617:22
Andrew Woods rejected "Run benchmark tests for single Fedora node on server hardware." https://www.pivotaltracker.com/story/show/60983816
* ermadmix leaves17:23
* github-ff joins17:30
[fcrepo4] osmandin opened pull request #176: Removed Abdera; introduced ROME for ATOM. (master...rome) http://git.io/CWHm1w
* github-ff leaves
<pivotal-bot>Osman Din added comment: "PR https://github.com/futures/fcrepo4/pull/176" https://www.pivotaltracker.com/story/show/6115284217:31
Osman Din added comment: "@awoods I can send a message to the mailing list." https://www.pivotaltracker.com/story/show/6115284217:33
Osman Din finished "Fix integration test AtomJMSIT exception related to truncated messages" https://www.pivotaltracker.com/story/show/61152842
Osman Din added comment: "@awoods Should I try to include a link to sample code for reading these files (see cbeer's ruby/python referen..." https://www.pivotaltracker.com/story/show/4901279917:34
* osmandin leaves17:37
<pivotal-bot>Andrew Woods added comment: "@osmandin, It looks like you are starting with the LevelDB setup, which may be trickier to inspect. Whateve..." https://www.pivotaltracker.com/story/show/49012799
<barmintor>ok, I had to take an early snapshot (was only about 33% through), but it's up on AWS17:43
will try to have a look this weekend
<awoods>barmintor: thanks.17:44
I am heading to the grocery...17:46
<pivotal-bot>Gregory Jansen added comment: "https://wiki.duraspace.org/display/FF/Test+-+Platform+Profile%3A+Single+VM+at+UNC+Chapel+Hill" https://www.pivotaltracker.com/story/show/6098381618:17
Gregory Jansen finished "Run benchmark tests for single Fedora node on server hardware." https://www.pivotaltracker.com/story/show/60983816
* gregjansen leaves18:19
<pivotal-bot>Andrew Woods accepted "Run benchmark tests for single Fedora node on server hardware." https://www.pivotaltracker.com/story/show/6098381618:56
* md5wz__ joins19:13
* mikeAtUVa leaves19:15
* nbanks joins19:23
* escowles leaves19:25
* md5wz__ leaves19:27
* md5wz__ joins19:28
* md5wz__ leaves20:37
* nbanks leaves20:58