2017 Winter Project Week/IPFS NoSQL Combination

From NAMIC Wiki

Latest revision as of 14:56, 13 January 2017

Key Investigators

  • Hans Meine (University of Bremen, Fraunhofer MEVIS = FME)
  • Steve Pieper (Isomics)
  • Satra Ghosh

Project Description

Objective

  • Evaluate IPFS / NoSQL combination for MIC databases
  • Evaluate IPFS's pre-shared key (PSK) feature for "private clouds"

Approach and Plan

  • Build prototype for scanning images
    • put images / files into IPFS
    • put metadata into a NoSQL database (Elasticsearch is what we used at FME; CouchDB is what Steve used in Chronicle)
  • Build prototype for browsing / showing images
    • should update live when images appear in the DB
    • should fetch image data from IPFS

Progress and Next Steps

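  • IPFS stability and status
    • Version 0.4.4 does not yet play well with up-to-date FUSE for macOS (https://github.com/ipfs/go-ipfs/issues/3250)
    • Listing directories that are not available on the current machine is slow (several seconds delay)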
  • Performed several transfer experiments; performance was disappointing (possibly related to the MIT Wi-Fi), but the transfers eventually succeeded
  • QmPyXW927iBPHVk3hfyzXAPGDpup26WGEh4LYK6da2xMhA is a TCGA subdirectory transferred to several project week participants' computers
  • Discussed an interesting deduplication idea: can original DICOM, anonymized DICOM and e.g. NIfTI files share data blocks?
    • Answer: Yes. The --chunker argument to ipfs add defines the chunking algorithm (https://github.com/ipfs/faq/issues/214). Apparently, the rabin chunker should already perform well without any particular knowledge of the file formats, but it does not seem to be the default algorithm. Valid arguments include 'rabin', 'rabin-[avg]' and 'rabin-[min]-[avg]-[max]', with integer parameters.
    • Also pay attention to the development of IPLD (for example, the format definition: https://github.com/ipld/specs/tree/master/ipld#format-definition)
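
The deduplication idea above can be illustrated with a small, self-contained sketch. It is not the IPFS implementation: a simplified polynomial rolling hash stands in for the Rabin fingerprint, and the two "files" are synthetic byte strings (a header of different length followed by the same payload) standing in for an original and an anonymized image. All names and constants below are assumptions chosen for illustration.

```python
import hashlib
import random

WINDOW = 48            # bytes covered by the rolling hash
BASE = 257             # multiplier of the polynomial rolling hash
MOD = (1 << 61) - 1    # large prime modulus
MASK = (1 << 11) - 1   # cut where the low 11 bits are zero (about 2 KiB average)
MIN_SIZE = 256         # never cut a chunk shorter than this
MAX_SIZE = 8192        # always cut a chunk at this length


def fixed_chunks(data, size=2048):
    """Split data into fixed-size blocks."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def content_defined_chunks(data):
    """Cut where a rolling hash of the last WINDOW bytes matches a bit pattern."""
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        length = i - start + 1
        if length > WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # append the newest byte
        if (length >= MIN_SIZE and (h & MASK) == 0) or length >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def shared_blocks(chunks_a, chunks_b):
    """Count distinct blocks (by SHA-256) that occur in both chunk lists."""
    hashes_a = {hashlib.sha256(c).digest() for c in chunks_a}
    hashes_b = {hashlib.sha256(c).digest() for c in chunks_b}
    return len(hashes_a & hashes_b)


def make_demo_files(payload_size=256 * 1024, seed=0):
    """Two synthetic files: different headers, identical pseudo-random payload."""
    rng = random.Random(seed)
    payload = bytes(rng.getrandbits(8) for _ in range(payload_size))
    original = b"HEADER-WITH-PATIENT-NAME" * 40 + payload
    anonymized = b"HEADER-ANON" * 40 + payload
    return original, anonymized


if __name__ == "__main__":
    original, anonymized = make_demo_files()
    print("fixed-size shared blocks:",
          shared_blocks(fixed_chunks(original), fixed_chunks(anonymized)))
    print("content-defined shared blocks:",
          shared_blocks(content_defined_chunks(original),
                        content_defined_chunks(anonymized)))
```

Because the headers differ in length, every fixed-size block boundary in the payload is shifted, so the fixed-size blocks do not line up. The content-defined boundaries depend only on nearby bytes, so the two chunk streams realign inside the shared payload and most blocks can be stored once.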

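The scanning prototype listed under Approach and Plan (files into IPFS, metadata into a NoSQL database) might look roughly like the following sketch. It assumes the ipfs command-line tool is installed and a repository is initialized; the helper names, the document layout and the example field modality are hypothetical and not taken from the project. Sending the JSON document to Elasticsearch or CouchDB is left to the respective client.

```python
import json
import subprocess
from pathlib import Path


def ipfs_add_command(path, chunker="rabin"):
    """Build the `ipfs add` call; --quiet restricts the output to hashes."""
    return ["ipfs", "add", "--quiet", "--chunker", chunker, str(path)]


def add_to_ipfs(path, chunker="rabin"):
    """Add one file to IPFS and return its hash (needs the ipfs CLI)."""
    result = subprocess.run(ipfs_add_command(path, chunker),
                            check=True, capture_output=True, text=True)
    # The last output line carries the hash of the added file.
    return result.stdout.strip().splitlines()[-1]


def metadata_document(path, ipfs_hash, **fields):
    """Build the metadata record that would go into the NoSQL database."""
    path = Path(path)
    doc = {
        "filename": path.name,
        "size_bytes": path.stat().st_size,
        "ipfs_hash": ipfs_hash,   # link from the metadata to the IPFS content
    }
    doc.update(fields)            # e.g. fields extracted from the image header
    return doc


def to_json(doc):
    """Serialize the record for an HTTP-based document store."""
    return json.dumps(doc, sort_keys=True)
```

A scan loop would call add_to_ipfs for each image, pass the returned hash to metadata_document together with whatever header fields are of interest, and store the result; a browser could then query the database and fetch the image data from IPFS by hash.
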
Background and References