Log of the #fcrepo channel on chat.freenode.net

Using timezone: Eastern Standard Time
<dbernstein>[Import/Export Standup]04:40
Finished yesterday:
* Nothing finished
Working on today:
* https://jira.duraspace.org/browse/FCREPO-2408
(make verification tool work with BagIt Bags)
* dbernstein leaves
* youn joins08:03
* coblej joins
* dwilcox joins08:05
* awoods joins08:11
<youn>[Import/Export Standup]08:18
Finished yesterday:
- https://jira.duraspace.org/browse/FCREPO-2369 (Import-export verification tool's rdf comparison fails on unicode-escaped characters) - used sample RDF containing Unicode to test the verification tool with the one click for 4.6.0 and 4.7.3-RC
- started proposal for videos08:19
Working on today:
- finish proposal for videos
- none
* westgard joins08:23
awoods: just FYI the irclogs server is back up. It seems it was brought back online on Friday.08:24
There are three days (Tu-Th) missing from the logs due to the outage.
* umgrosscol joins08:42
* coblej leaves08:50
* dhlamb joins09:00
* coblej joins09:01
<westgard>[Import/Export Standup]09:04
Finished yesterday:
* whikloj joins
<westgard> started https://jira.duraspace.org/browse/FCREPO-2457 (handle failed connection gracefully)
Working on today:
finish FCREPO-2457
* manez joins09:05
<coblej>[Import/Export Standup]09:09
Finished yesterday:
Working on today:
Document import structural expectations — https://jira.duraspace.org/browse/FCREPO-2455
<awoods>westgard: IRC: yes, I saw that. Thanks for keeping me posted.09:11
* benpennell joins09:12
* mikeAtUVa joins09:13
<awoods>[Import/Export Standup]09:14
Finished last Friday:
- Reviewed tickets
Working on today:
- Reviewing tickets
- None
<mikeAtUVa>[import/export sprint standup]
- mostly worked on unrelated things
- Finish up https://jira.duraspace.org/browse/FCREPO-2461
- Determine if more changes to fcrepo are in order (for versioning) in the scope of this sprint
- none
<escowles>[Import/Export Standup]
* Exporting members based on inbound links https://jira.duraspace.org/browse/FCREPO-2453
* Fix Bag import using config file https://jira.duraspace.org/browse/FCREPO-2462 09:16
* Rebase PR #84 (exporting members based on inbound links)
* Not sure after that: whatever needs help
* None
<benpennell>[Import/Export Standup]09:17
Finished yesterday:
* Create PR for export of versions https://jira.duraspace.org/browse/FCREPO-2458
Working on today:
* Planning and implementation of version import https://jira.duraspace.org/browse/FCREPO-2459
* Some local stuff
* none
<awoods>escowles/lsitu: it sounds like you are both needing new tickets, no?09:19
<escowles>awoods: yes — there are some low-priority tickets in the backlog i could pick up, but let me know what's the most important
* yamil joins09:21
<awoods>escowles/lsitu: any of the "minor" priority tickets are fair game... it would also be good to know if mikeAtUVa or benpennell needed any help.09:22
<benpennell>mikeAtUVa: Before I go too far down this path I wanted to make sure it makes sense. It seems like importing versions will require performing a full import of each version path one after the other to the same destination, with a "create version" call between each one. That will need to happen before importing the current version.09:24
mikeAtUVa: items that disappear between versions will need to be deleted, so there would need to be some diffing. updating binaries and triples could probably both be done in a dumb way where it keeps reimporting them for each version09:25
<mikeAtUVa>benpennell: yes, that's the gist... you're replaying the history in chronological order.
<benpennell>mikeAtUVa: since there's no diff between versions i'm not sure if the order would matter while we're unable to set timestamps, but yeah, no reason not to go in order09:26
<mikeAtUVa>benpennell: yeah, diffing children would be more efficient than the naive approach of just blowing away all children after making the version.09:27
* bseeger joins
<benpennell>mikeAtUVa: yeah, spotting changes in binaries would be reasonable if a little expensive without a bag manifest. i think there might already be some utilities for figuring out if properties match, although i don't recall where presently09:29
<mikeAtUVa>benpennell: I would preserve the order to preserve as many assumptions about versions as we can in the modeshape representation so that when it comes to aligning them with the spec, we'll have more options.
benpennell: I would probably draw the line at diffing where you *have* to for now... ie, the presence or absence of a child, for triples, just PUT the new RDF.09:30
<benpennell>mikeAtUVa: okay i will move ahead with this approach and see what's reasonable
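A rough illustration of the replay approach discussed above, assuming a plain in-memory dictionary as a stand-in for the repository. Each historical snapshot is applied in chronological order: children missing from the snapshot are deleted, everything still present is simply written again, and a version is recorded before moving on. The current state is imported last. The names `apply_snapshot` and `import_with_history` are hypothetical and are not part of fcrepo-import-export or the Fedora API.

```python
def apply_snapshot(live, snapshot):
    """Make the live tree match one exported snapshot.

    live and snapshot map resource paths to their serialized content.
    Only the presence or absence of a path is diffed; content is always
    overwritten (the "just PUT the new RDF" approach).
    """
    # Delete resources that disappeared between versions.
    for path in set(live) - set(snapshot):
        del live[path]
    # Naively re-write every resource that is present in this snapshot.
    live.update(snapshot)


def import_with_history(history, current):
    """Replay (label, snapshot) pairs oldest first, then import current.

    Returns the final live tree and the list of versions that were saved.
    """
    live = {}
    saved_versions = []
    for label, snapshot in history:
        apply_snapshot(live, snapshot)
        # Stand-in for the "create version" call between imports.
        saved_versions.append((label, dict(live)))
    # The current state is imported after all historical versions.
    apply_snapshot(live, current)
    return live, saved_versions


history = [
    ("v1", {"/obj": "title one", "/obj/child": "binary"}),
    ("v2", {"/obj": "title two"}),
]
live, versions = import_with_history(history, {"/obj": "title three"})
```

Diffing content (for example, comparing binary digests) could be layered on later; the sketch only draws the line at child presence, as suggested in the conversation.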
* lsitu joins09:45
* peichman joins09:46
<lsitu>[Import/Export Standup]09:49
Finished yesterday:
Import of Bags should verify binary digest: https://jira.duraspace.org/browse/FCREPO-2418
Consolidate the findRepositoryRoot method for PR with round-tripping with binaries excluded: https://jira.duraspace.org/browse/FCREPO-2426
Working on today:
Improve Import of Bags should verify binary digest: https://jira.duraspace.org/browse/FCREPO-2418?
Will find more tickets.
* bseeger leaves09:50
<awoods>benpennell/coblej/escowles/lsitu/mikeAtUVa/westgard/youn: reminder: import/export sprint meeting @3pm ET... use the hangout link in the calendar invite.09:56
<coblej>awoods: got it ... I expect to be there09:57
* github-ff joins10:13
[fcrepo-import-export] escowles force-pushed inbound from 34fd405 to a654538: https://git.io/vHtYs
fcrepo-import-export/inbound a654538 Esmé Cowles: Adding support for exporting inbound references
* github-ff leaves
* manez leaves10:26
* github-ff joins10:33
[fcrepo-import-export] awoods pushed 1 new commit to master: https://git.io/vHtsl
fcrepo-import-export/master c8f913f lsitu: Added support for round-tripping with binaries excluded. (#82)...
* github-ff leaves
* github-ff joins10:58
[fcrepo-import-export] escowles force-pushed inbound from a654538 to 1e0af71: https://git.io/vHtYs
fcrepo-import-export/inbound 1e0af71 Esmé Cowles: Adding support for exporting inbound references
* github-ff leaves
<youn>coblej/westgard/awoods: Is there anything I should be working on? Thanks!11:18
<coblej>youn: I am working https://jira.duraspace.org/browse/FCREPO-2455 and having some difficulty getting my head around what would really be useful there. I hope to get an initial pass at something done sometime this afternoon and it would be useful to have someone look at what I've done.11:21
youn: Or if you want to look at the ticket in advance of that and have any suggestions for what would be helpful11:22
* manez joins11:23
<westgard>youn, coblej: My sense is that import/export is defining what "fedora's requirements" are when it comes to resources serialized to disk.11:24
* benpennell leaves11:25
<westgard>I think it is even less clear what those requirements might be for resources that were never previously part of Fedora.
* awoods going into a series of meeting... will talk to you all at 3pm ET11:26
<escowles>lsitu: i'm going to get some lunch now, but i can take a look at https://jira.duraspace.org/browse/FCREPO-2467 after that
<lsitu>escowles: Thanks.11:27
<coblej>westgard, youn: yeah, that's kind of what I'm struggling with. I probably could try to document the export format but it feels like a pretty big task to try to document in detail how to construct an import package from scratch (except to say make it look like what export does)11:28
* benpennell joins
<westgard>Does one mimic the pair tree? How does one go about assigning URIs, or can resources not previously in Fedora be given placeholder identifiers? If the latter, then import would need to know how to recognize those and allow Fedora to assign URIs.
* github-ff joins11:29
[fcrepo-import-export] awoods deleted inbound at 1e0af71: https://git.io/vHt43
* github-ff leaves
<westgard>I think it is safe to say that that part of this problem is out of scope for the ticket you are working on, jcoble.
sorry coblej
<coblej>westgard, youn: yeah11:30
westgard, youn: I think it would be doable to document the export format in terms of the files and directory structures it creates ... I'll probably start with that and see if there is anything useful to do beyond that11:32
westgard, youn: FWIW, will be away from computer for an hour or so11:33
<westgard>coblej: sounds good11:34
<youn>westgard/coblej: Sorry, someone here had a question for me ... I was thinking about the Perseids use case and wondering if they need a way to get from bags for individual resources to something resembling the bag generated by the import export tool, if that's the approach they should take.11:38
In that case, it might be helpful for Perseids to know what the bag files are (and what they contain) and how the directory structure in the data directory contained in the bag resembles the URI paths in the repository and that the RDF resides in the directory for the resource ...
<westgard>youn: I think you are correct that is the approach they could take here. The problem I see is really that question of URIs.11:42
If the resources have never been in Fedora, knowing how to structure the resources on disk is really impossible.11:43
<youn>westgard: Good point. I suppose they could use some kind of node generator. I hope Fedora would accept the nodes!11:44
* coblej leaves11:45
* dbernstein joins11:58
<youn>westgard: My understanding is that the pair tree helps distribute resources. If they had some other means of coming up with a balanced tree that did not necessarily use auto-generated identifiers, would that be okay? Would Fedora need to be fooled into thinking it was importing one of its own bags? Order of import is handled through placeholders, right?12:09
* lsitu leaves12:11
* lsitu joins
* dbernstein leaves12:26
* manez leaves
<westgard>youn: Yes, a user could either come up with their own way of balancing the tree or just POST directly to whatever containers and structure one wanted to use. But on the other hand, I think it's something of a best practice to let Fedora assign URIs and create the pairtree, so I feel it would be good if import via this method supported allowing Fedora to assign locations.12:27
* dbernstein joins12:29
<lsitu>awoods/escowles: It looks like there is a bug with the last commit for “Adding support for exporting inbound references (#84) ”. I only get the root container exported with the following command:12:38
java -jar target/fcrepo-import-export-0.1.1-SNAPSHOT.jar -m export -b -d data -g default -G metadata.yml -r http://localhost:8686/f4/rest
<awoods>lsitu: hmmm... that's not good12:43
<escowles>i'll take a look
* coblej joins12:45
* dbernstein leaves
<escowles>lsitu: yes, i think it's another variation on the problem of having the trailing slash or not12:46
GET http://localhost:8080/rest => returns RDF with subject http://localhost:8080/rest/ (with trailing slash)
* dbernstein joins12:47
<lsitu>escowles: Yep, it works with the trailing slash.12:49
<escowles>this is the problem line: https://github.com/fcrepo4-labs/fcrepo-import-export/blob/master/src/main/java/org/fcrepo/importexport/exporter/Exporter.java#L343 12:51
we need to make sure that parent ends with a slash there (and for that matter, it must *not* end with a slash here: https://github.com/fcrepo4-labs/fcrepo-import-export/blob/master/src/main/java/org/fcrepo/importexport/exporter/Exporter.java#L349)
so, subjects must end with slashes and objects must not12:52
<awoods>lsitu/escowles: I can imagine the trailing slash issue to be an ongoing source of confusion for users of the tool.
<coblej>youn, westgard: it would not surprise me if you could specify the node paths of your choice as long as they didn't conflict with anything already in Fedora ... if I get a chance, I'll see if I can verify that ... I kind of think the harder problem is knowing exactly what properties to set on the RDF nodes
<escowles>awoods: agreed — and when we're dealing with resources, i think we can add or remove the slashes as fits the situation
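The trailing-slash mismatch described above can be illustrated with a small sketch, assuming URIs are handled as plain strings. It shows one way to add or remove a single trailing slash so that comparisons do not depend on how the user typed the repository URL. The helper names are hypothetical and this is not the code in Exporter.java.

```python
def with_trailing_slash(uri):
    """Return uri ending in exactly the slash it already had, or one added."""
    return uri if uri.endswith("/") else uri + "/"


def without_trailing_slash(uri):
    """Return uri with a single trailing slash removed, if present."""
    return uri[:-1] if uri.endswith("/") else uri


def same_resource(a, b):
    """Compare two URIs while ignoring a trailing-slash difference."""
    return without_trailing_slash(a) == without_trailing_slash(b)


requested = "http://localhost:8080/rest"
subject = "http://localhost:8080/rest/"
match = same_resource(requested, subject)
```

Normalizing at the comparison point, rather than requiring users to remember the slash on the command line, is one possible way to address the confusion awoods mentions.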
* coblej leaves13:00
* coblej_ joins13:01
<youn>Do you have any inverse predicates?13:03
* manez joins13:27
* manez leaves13:32
* ksclarke joins13:40
* coblej_ leaves14:01
* coblej joins14:07
* manez joins14:28
* manez leaves14:33
<westgard>youn: could you mark https://jira.duraspace.org/browse/FCREPO-2369 as 'ready for test' and assign me as the reviewer?14:38
I have already reviewed and tested and consider this issue sufficiently resolved by the two PRs that you made.14:39
<youn>westgard: done. thanks.14:40
* bridgetalmas joins14:59
* umgrosscol leaves15:18
* umgrosscol joins
<escowles>i made a ticket for verifying exported external content: https://jira.duraspace.org/browse/FCREPO-2471 15:22
* manez joins
<bridgetalmas>hi all. I have to jump off. As I'm not really a core contributing member, I didn't want to interrupt the conversation. I joined the call today mainly to see if there were questions I could answer on issue https://jira.duraspace.org/browse/FCREPO-2444. Feel free to email me if there is more I can do on that.15:24
Participating in this effort has been really helpful to me in my planning for Perseids. Thank you for including me! I think the Research Object Bundle BagIt profile doesn't make a lot of sense for the import/export tool so I'm going to focus on the basic LDP model for my import/export scenario.
Let me know if there is any more I can do.15:25
* bridgetalmas leaves15:55
* westgard leaves15:59
<escowles>gotta run — talk to y'all later16:06
<youn>are inbound references, statements with the child as subject?16:10
<awoods>youn: they are resources with the "collection" as the object16:11
youn: the members have triples that reference the collection... vs. the other way around16:12
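As a small illustration of the inbound-reference idea, assuming triples are represented as plain (subject, predicate, object) tuples: the members are the subjects of any triple whose object is the resource being exported. The function name `inbound_members` and the sample URIs are hypothetical, and `pcdm:memberOf` is used only as an example predicate.

```python
def inbound_members(triples, target):
    """Return the sorted subjects of triples that point at target."""
    return sorted({s for s, p, o in triples if o == target and s != target})


collection = "http://localhost:8080/rest/collection"
triples = [
    # The members reference the collection, not the other way around.
    ("http://localhost:8080/rest/item1", "pcdm:memberOf", collection),
    ("http://localhost:8080/rest/item2", "pcdm:memberOf", collection),
    ("http://localhost:8080/rest/item3", "dc:title", "unrelated literal"),
]
members = inbound_members(triples, collection)
```

Because the lookup keys on the object rather than on a particular predicate, any resource that links to the target would be picked up under this sketch.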
<youn>Can they refer to any ancestor node, not just the collection, e.g., from pcdm:FileSet to pcdm:Object?16:14
* youn leaves16:19
* westgard joins16:41
awoods: sorry I had to bolt from the sprint meeting at 4 16:42
awoods, dbernstein: I missed Danny's assessment of where we stand on the verification tool. Could you summarize briefly?16:43
<dbernstein>I was just saying that it seems that we have quite a number of tickets.
* manez leaves16:44
<westgard>Yes, there are a lot of them, but many of them represent requirements that will be met by a 1.0 release without additional work.
<dbernstein>oh - okay - I didn’t realize that.16:45
westgard: I’m hoping to get through the bagit piece by tomorrow. What do you think I should jump on next?
I was thinking https://jira.duraspace.org/browse/FCREPO-2307?src=confmacro 16:46
<westgard>dbernstein: once there is an import/export tool with version support FCREPO-2456 would be important.16:48
<awoods>youn: Fedora does not care which resources point at others. coblej may have a more specific production scenario.
<westgard>It might be easy, or it might be hard. Depends on how different versions look on disk.16:49
dbernstein: regarding FCREPO-2307, I'm ambivalent. In practice, for migrations I see the use case for this, but on the other hand, you can already get this for all intents and purposes by verifying an export and then verifying the import.16:50
So I see verifying two fedoras as more of a convenience than a necessity.16:51
<dbernstein>okay - I’ll jump on 2456 next.
<westgard>Sounds good. I can try to focus on 2449 (handle all responses from Fedora)16:52
If we get those two sorted, plus bagit bags, then I think we're pretty close.16:53
2362 has already been analyzed pretty extensively and a couple issues fixed. The remaining issues haven't been reproducible.
* mikeAtUVa leaves16:54
<westgard>But there should be testing at the end of this sprint so that should confirm whether we can close 2362 or not.
<dbernstein>okay - sounds good, westgard
<dbernstein>I may be able to hit another issue before version support (FCREPO-2456) is merged. Are there any others I might work on in case I finish the bagit issue?16:57
<westgard>dbernstein: BTW I'll also create a ticket to update the config file logic for the latest version of import export.
<dbernstein>ie finish it this afternoon.
okay- maybe I can look at that one.
<westgard>OK so if you finish what you're on now there'll be this new ticket.
creating it now
I need to jump for a couple of hours. back later this afternoon.
* dbernstein leaves
* whikloj leaves17:01
* dbernstein joins17:02
* dbernstein leaves17:03
* benpennell leaves17:05
* github-ff joins17:07
[fcrepo-import-export] escowles created trailing-slash (+1 new commit): https://git.io/vHqz0
fcrepo-import-export/trailing-slash cebfc8e Esmé Cowles: Improving filename and URI mapping where both source and destination ending with trailing slashes
* github-ff leaves
* manez joins17:12
* manez leaves
* peichman leaves17:13
* coblej leaves17:32
* manez joins
* dwilcox leaves17:39
* umgrosscol leaves17:42
* dwilcox joins17:44
* dhlamb leaves17:46
* youn joins18:06
westgard: If there are any verification tool tickets for me to work on that would not be time better spent on testing import export or verify, please let me know.18:10
awoods, escowles: I think another way to traverse a repository would be by user-supplied predicates plus all arguments of those predicates.18:12
<westgard>youn: Thanks. dbernstein and I were just talking about https://jira.duraspace.org/browse/FCREPO-2473 which will be important if we want to get to a release that works with the current version of import/export
<youn>xpost -- Rather all arguments of predicates in the RDF of the resources that are the object (?) of the supplied predicate(s).18:14
* yamil leaves18:15
<youn>westgard: Okay, I will take that. If you have time, could you let me know how you ran rdflib directly on the files that triggered verification errors?18:16
<awoods>youn: although a good idea, the predicate/object combination is probably out of scope for this sprint.18:19
<youn>awoods: understood. For escowles' 10k, I think you'd get the objects with the members.18:22
<awoods>youn: actually, the objects of the triples are actual repository resources... so your scenario seems to be asking to traverse based on specific resources.18:23
<youn>Sorry for the confusion -- objects is the label for a node in 10k that contains binaries.18:24
Yes, specific resources that are related to objects of a supplied predicate. Those statements would not make sense unless the resources are there.18:30
<westgard>youn: the easiest way is with the python REPL. I'll post a gist.18:35
* youn leaves18:38
* ksclarke leaves18:44
<westgard>youn: https://gist.github.com/jwestgard/8bdfc61899f94127d373ee3540617df2 18:47
* ksclarke joins18:49
* westgard leaves18:50
* ksclarke leaves
* dwilcox leaves21:11
* umgrosscol joins21:28
* dhlamb joins21:44
* benpennell joins22:26
* benpennell leaves22:39
* lsitu leaves22:43
* dhlamb leaves23:12
* awoods leaves23:29
* benpennell joins23:32
* umgrosscol leaves23:45
* manez leaves23:53
