
Concurrent File System Scans

  • Released

sabin1001

Enhance the SpaceObServer scan service to execute scans concurrently, with the goal of making scans of large file system trees faster.

Activity

Team SpaceObServer

This feature has now been introduced with SpaceObServer version 7.2.


Team SpaceObServer

Status changed to: Released

Team SpaceObServer

Please note that we are planning to implement this in the upcoming version 7.2.


Team SpaceObServer

Status changed to: Planned

marc thuijs

We also have many large file systems of up to 100+ TB in size and tens of millions of files, largely HPC data. My experience is similar to what others have described here, but I also find it difficult to scale at the database end. Adding more scanning servers seems to move the bottleneck to the database, so then you need to add more database instances and/or servers, plus some logic for allocating scan locations to scanning servers and database instances. It gets messy quickly. This may not be the appropriate forum, but I would be interested in hearing how others approach these issues.
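
For what it's worth, a rough sketch of one way to spread scan roots across database instances with a stable hash, so the mapping doesn't have to be maintained by hand. The instance names and paths are made up for illustration; this is not anything SpaceObServer provides.

    import hashlib

    # Assumed database instance names; purely illustrative.
    DB_INSTANCES = ["scandb01", "scandb02", "scandb03"]

    def assign_database(scan_root, instances=DB_INSTANCES):
        """Map a scan root to one database instance via a stable hash."""
        digest = hashlib.sha1(scan_root.lower().encode("utf-8")).hexdigest()
        return instances[int(digest, 16) % len(instances)]

    # Hypothetical UNC paths for illustration only.
    for root in [r"\\hpc-nas\projects", r"\\hpc-nas\scratch", r"\\filer\home"]:
        print(root, "->", assign_database(root))

Simple modulo hashing reshuffles many roots whenever an instance is added or removed; consistent hashing or an explicit mapping table avoids that, at the cost of some extra bookkeeping.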


George Wagner

I also have a number of huge locations to scan and report on. Am I the only one who has to wait overnight to export data from the scans after they complete? I need a list of folders and sizes that are already displayed in the grid, but SpaceObServer seems to want to calculate the sizes again, which takes many hours and often crashes for me.


Jeffrey Beard

Yes, we have about 41 million files.


Nathan

Really looking forward to this. We have a NAS location with well over 1 billion files; an initial scan would take months, so we had to split the location into hundreds of separate scans and deploy multiple SpaceObServer instances just so we could run scans in parallel.

I would say the following things need to change:

1. Allow a single SpaceObServer instance to execute multiple parallel scans; having to deploy multiple VMs/SpaceObServer instances to do this is a waste of resources in our case.

2. Multithread the initial scans: something like a directory listing at the root, with multiple threads splitting off to scan each subdirectory, and so on (see the sketch after this list).

3. Increase the multithreading of scans after the initial one; I think the current limit is 32 threads.
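
To illustrate point 2, a minimal sketch of the idea (not SpaceObServer code; the share name, thread count, and aggregation are assumptions): list the root once, then let a thread pool walk each top-level subtree in parallel.

    import os
    from concurrent.futures import ThreadPoolExecutor

    def scan_subtree(path):
        """Walk one subtree and return (file_count, total_bytes)."""
        files, size = 0, 0
        for dirpath, _dirnames, filenames in os.walk(path):
            for name in filenames:
                try:
                    size += os.path.getsize(os.path.join(dirpath, name))
                    files += 1
                except OSError:
                    pass  # file vanished or is inaccessible; skip it
        return files, size

    def parallel_scan(root, workers=32):
        """List the root, then scan each top-level directory on its own thread."""
        with os.scandir(root) as it:
            subdirs = [e.path for e in it if e.is_dir(follow_symlinks=False)]
        total_files, total_size = 0, 0
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for files, size in pool.map(scan_subtree, subdirs):
                total_files += files
                total_size += size
        return total_files, total_size

    if __name__ == "__main__":
        # Hypothetical share name for illustration only.
        print(parallel_scan(r"\\nas\share"))

A real implementation would also have to count files sitting directly in the root and split work deeper than one level, since top-level subtrees can be very unevenly sized.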


Team SpaceObServer

Thanks for your post. We already plan to launch this feature in one of our next releases.

