VSAN Free Crashed

Pure software-based VM-centric and flash-friendly VM storage (iSCSI, SMB3, NFS, iSER and NVMe over Fabrics) including free version. Same software embedded on StarWind ready nodes.

Moderators: art (staff), anton (staff), Anatoly (staff), Max (staff)

Postby DWHITTRED » Mon May 04, 2020 12:29 am

Hi,
I have been evaluating the Virtual SAN Free and last night it unexpectedly crashed and recorded this in the log:
5/4 3:36:32.777196 634 error: Sp: *** CStarPackCoreNew::CStarPackCore::FlushingThread (8807) (0x000002668C020000) pNextContext is NULL
5/4 3:36:32.777326 634 error: Sp: *** CStarPackCoreNew::CStarPackCore::FlushingThread (8772) (0x000002668C020000) pNextContext is NULL
5/4 3:36:32.777338 634 error: Sp: *** CStarPackCoreNew::CStarPackCore::InternalFlushEx (8532) (0x000002668C020000) pContext is NULL
5/4 3:36:32.824361 634 debug: *** _miniDumpFilter: The program encountered a serious error and may be closed. Crash dump will be created.
Please, save the log file and the crash dump and report the problem to [email protected]
5/4 3:47:44.989310 634 debug: *** _miniDumpFilter: Minidump 'C:\Program Files\StarWind Software\StarWind\starwind.20200504.033632829.mdmp' created successfully.
5/4 3:47:45.086960 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB450
5/4 3:47:45.087033 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB870
5/4 3:47:45.087054 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB5F0
5/4 3:47:45.087070 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB610

Could you help me understand why this happened? This also happened about two months ago.

I used the StarWind Log Collector before restarting the service, so I should have copies of any information you may need.
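
For readers triaging a similar log, here is a minimal Python sketch that groups the error lines of an excerpt like the one above. The line layout (date, time, thread ID, level, message) is inferred only from this excerpt and is an assumption, not a documented StarWind log format; the names `parse_line` and `summarize_errors` are hypothetical.

```python
import re
from collections import Counter

# Layout assumed from the excerpt: "M/D H:MM:SS.ffffff <thread> <level>: <message>"
LINE_RE = re.compile(
    r"^(?P<date>\d{1,2}/\d{1,2}) "
    r"(?P<time>\d{1,2}:\d{2}:\d{2}\.\d+) "
    r"(?P<thread>\w+) (?P<level>\w+): (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def summarize_errors(lines):
    """Count error messages, masking hex addresses so repeated messages group together."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record and record["level"] == "error":
            key = re.sub(r"0x[0-9A-Fa-f]+", "0x?", record["message"])
            counts[key] += 1
    return counts


# Two lines copied from the excerpt above.
SAMPLE = [
    "5/4 3:47:45.086960 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB450",
    "5/4 3:47:45.087033 634 error: Sp: xxx CStarPackCoreNew::CStarPackManager::Stop (772) Destroying CStarPackThread object at 0x00000269112FB870",
]

if __name__ == "__main__":
    for message, count in summarize_errors(SAMPLE).items():
        print(count, message)
```

A summary like this only shows which messages repeat; as the replies below note, the crash dump is still what support needs for diagnosis.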
DWHITTRED
 
Posts: 11
Joined: Sun Dec 01, 2019 11:38 pm

Re: VSAN Free Crashed

Postby yaroslav (staff) » Mon May 04, 2020 10:38 am

Hi DWHITTRED,

I am afraid that a small excerpt of the StarWind service log is not enough; the key information is in the dump file.
Could you please share the full logs with me? Please use StarWind Log Collector for that purpose: https://knowledgebase.starwindsoftware. ... collector/.
I would also like to see the dump file.
Please use Google Drive to share the logs (they may be too large to post on the forum).
yaroslav (staff)
Staff
 
Posts: 218
Joined: Mon Nov 18, 2019 11:11 am

Re: VSAN Free Crashed

Postby DWHITTRED » Sun May 17, 2020 1:42 am

Hi,

Apologies for not getting back to you - I kinda expected a notification through my emails and forgot to check the forum (sorry!)

Below is a link to my logs from the StarWind Log Collector:

https://drive.google.com/file/d/1KRZe91MCaCig2LScjWkdQBEycmbCRFe1/view?usp=sharing

I haven't uploaded the dump file yet - it's 6.3 GB and will take a long time to upload. Can you confirm whether you need a copy of the dump file before I start that upload process?

Kind regards,
Daniel
DWHITTRED
 
Posts: 11
Joined: Sun Dec 01, 2019 11:38 pm

Re: VSAN Free Crashed

Postby yaroslav (staff) » Sun May 17, 2020 9:23 am

Greetings,

The archive you provided contains only the StarWind VSAN service logs. StarWind Log Collector also gathers system logs, application logs, information on StarWind HA devices, and network connections, so its output is far more informative than the StarWind VSAN logs alone.
Yes, we need the minidump: when the service crashes, all the useful information is in the minidump.
yaroslav (staff)
Staff
 
Posts: 218
Joined: Mon Nov 18, 2019 11:11 am

Re: VSAN Free Crashed

Postby yaroslav (staff) » Wed May 20, 2020 12:04 pm

Hi,

I have finished investigating the logs.
Let me clarify what has caused the issue.
5/4 3:36:32.777196 634 error: Sp: *** CStarPackCoreNew::CStarPackCore::FlushingThread (8807) (0x000002668C020000) pNextContext is NULL
5/4 3:36:32.777326 634 error: Sp: *** CStarPackCoreNew::CStarPackCore::FlushingThread (8772) (0x000002668C020000) pNextContext is NULL

Events indicate that there are not enough CPUs assigned. Try using 8 CPUs (4 vCPUs in 2 sockets). Also, consider using at least 8GB of RAM
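
As a rough illustration of the sizing advice above, the following Python sketch compares a host against the two figures mentioned in this post (8 CPUs and 8 GB of RAM). The function name and the idea of treating these figures as thresholds are assumptions made for the example; they are not published StarWind minimums. RAM is passed in by the caller because the Python standard library has no portable way to read it.

```python
import os


def meets_suggestion(logical_cpus, ram_gb, min_cpus=8, min_ram_gb=8):
    """Return a list of shortfalls against the figures suggested in the post.

    min_cpus and min_ram_gb default to the values quoted above; treating them
    as hard thresholds is an assumption of this sketch.
    """
    problems = []
    if logical_cpus < min_cpus:
        problems.append(f"CPUs: have {logical_cpus}, suggested {min_cpus}")
    if ram_gb < min_ram_gb:
        problems.append(f"RAM: have {ram_gb} GB, suggested {min_ram_gb} GB")
    return problems


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs and may return None on some platforms.
    cpus = os.cpu_count() or 0
    print(meets_suggestion(cpus, ram_gb=32))
```

An empty list means both figures are met; it says nothing about whether the CPUs are fast enough for a given workload.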

I have also noticed that you used LSFS. Please, note that we do not recommend LSFS for any production use due to the intense growth of LSFS devices. LSFS container description is here https://knowledgebase.starwindsoftware. ... scription/.
Please consider updating to the latest build. Download it at https://www.starwindsoftware.com/tmplin ... ind-v8.exe; the update procedure can be found at https://knowledgebase.starwindsoftware. ... d-version/.

Please uninstall the following components: StarWind VSS Provider, StarWind Cluster Service, and StarWind SMI-S Agent. They are not needed.
Uninstall StarWind Software VSS Provider – run the commands below from CMD:
cd "C:\Program Files\StarWind Software\StarWind\VSS"
stop_.bat

Uninstall SMI-S Agent – run the commands below from CMD:
cd "C:\Program Files\StarWind Software\StarWind\OpenPegasus\bin\"
ConfiguratorConsole.exe --stop --name StarWindSMISAgent
cd "C:\Program Files\StarWind Software\StarWind\OpenPegasus\bin\"
ConfiguratorConsole.exe --uninstall --name StarWindSMISAgent

Uninstall StarWind Cluster Service – run the commands below from CMD:
cd C:\Windows\Microsoft.NET\Framework\v4.0.30319
installutil.exe /u "C:\Program Files\StarWind Software\StarWind\StarWindCluster\StarWind.ClusterService.exe"

I have escalated this issue to the R&D team. I will let you know if I learn anything interesting from them.
yaroslav (staff)
Staff
 
Posts: 218
Joined: Mon Nov 18, 2019 11:11 am

Re: VSAN Free Crashed

Postby DWHITTRED » Sun May 24, 2020 11:07 am

Hi,

Thank you for the information. I have some additional questions.

yaroslav (staff) wrote:Events indicate that there are not enough CPUs assigned. Try using 8 CPUs (4 vCPUs in 2 sockets). Also, consider using at least 8GB of RAM

This is a physical machine, so I cannot provision additional vCPUs. It is used only as a storage device and doesn't run other services. It also has 32GB of RAM, and I have sized my LSFS device based on your RAM requirements for inline de-duplication. Looking at the historical performance data, the machine does not seem to see heavy CPU utilisation even when I am maxing out my iSCSI network speed; for example, in the last 24 hours the CPU did not exceed 58% (from the StarWind Management Console performance graph). Could you recommend some performance benchmarks that I need to meet for this to not fail? For example, a minimum CPU synthetic PassMark score from https://www.cpubenchmark.net/.


yaroslav (staff) wrote:I have also noticed that you used LSFS. Please, note that we do not recommend LSFS for any production use due to the intense growth of LSFS devices.

This contradicts StarWind's published information on LSFS. In your existing LSFS documentation (going back to 2014) you state that it is the ideal storage device for virtualised workloads. How can this be if you do not recommend its use? Is your published information incorrect? Can you please confirm this statement? I find it very concerning to be told that this is not recommended after I have followed your advertising, whitepapers, and documentation.


yaroslav (staff) wrote:Please consider updating to the latest build.

After this failure I upgraded to the latest build as of May 4th as part of my troubleshooting. I will update to the latest version as of May 6th.


yaroslav (staff) wrote:Please uninstall the following components: StarWind VSS Provider, StarWind Cluster Service, and StarWind SMI-S Agent. They are not needed.

Thank you for pointing this out. I will remove these components as requested.
DWHITTRED
 
Posts: 11
Joined: Sun Dec 01, 2019 11:38 pm

Re: VSAN Free Crashed

Postby yaroslav (staff) » Mon May 25, 2020 3:02 pm

DWHITTRED wrote:Could you recommend some performance benchmarks that I need to meet for this to not fail? For example, a minimum CPU synthetic PassMark score from https://www.cpubenchmark.net/.

You can use any convenient tool that is recommended by the community.

DWHITTRED wrote:How can this be if you do not recommend its use?

Do not get me wrong. You can use it for production; however, please make sure that the environment meets the StarWind recommendations listed at https://knowledgebase.starwindsoftware. ... scription/. Please be aware of storage consumption: LSFS files can occupy 3 times more space than the initial LSFS device size, and snapshots require additional space on top of that.
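
To make the storage-consumption point concrete, here is a small Python sketch of the arithmetic. It assumes "3 times" means the backing storage should hold three times the nominal LSFS device size; the function name and the `snapshot_reserve_gb` parameter are hypothetical, and the post gives no figure for snapshot space, so that value defaults to zero and must be supplied by the reader.

```python
def lsfs_disk_budget_gb(device_size_gb, growth_factor=3.0, snapshot_reserve_gb=0.0):
    """Estimate backing storage for an LSFS device.

    growth_factor=3.0 reflects the "3 times" figure quoted above (an
    interpretation, not a documented formula). snapshot_reserve_gb is a
    caller-supplied allowance; the thread does not say how much is needed.
    """
    if device_size_gb <= 0:
        raise ValueError("device_size_gb must be positive")
    if growth_factor < 1 or snapshot_reserve_gb < 0:
        raise ValueError("growth_factor must be >= 1 and snapshot_reserve_gb >= 0")
    return device_size_gb * growth_factor + snapshot_reserve_gb


if __name__ == "__main__":
    # A 1000 GB LSFS device with a hypothetical 200 GB snapshot allowance.
    print(lsfs_disk_budget_gb(1000, snapshot_reserve_gb=200))
```

The result is a planning estimate only; the linked LSFS description remains the reference for actual requirements.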

Please let us know if there is anything else I can assist you with.
yaroslav (staff)
Staff
 
Posts: 218
Joined: Mon Nov 18, 2019 11:11 am

Re: VSAN Free Crashed

Postby DWHITTRED » Wed May 27, 2020 12:50 am

Hi,

yaroslav (staff) wrote:You can use any convenient tool that is recommended by the community.
I am not following. I asked what level of performance I need to meet.

yaroslav (staff) wrote:Events indicate that there are not enough CPUs assigned.
What you have said here is that I need more CPU. What I am trying to find out is how much more. What is the minimum clock speed, core count, or performance benchmark that I need to meet?

yaroslav (staff) wrote:Do not get me wrong. You can use it for production
Thank you for the confirmation, however in your previous post you literally said not to use it in production:
yaroslav (staff) wrote:Please, note that we do not recommend LSFS for any production use
You understand why I am confused?

Regards,
Daniel
DWHITTRED
 
Posts: 11
Joined: Sun Dec 01, 2019 11:38 pm

