Row-Based Storage vs Columnar Storage: SQL Server Tables vs Parquet Files is one of those topics that sounds academic until it slows a report, inflates storage costs, or blocks a data project. The ...
A while ago, I was asked by a former colleague about the best way to convert Parquet files into comma-separated values (CSV) format using Python. The honest answer? It depends. And so on and so on ...
Since GitHub doesn't provide a great way for you to learn about new releases and features, don't just star the repo, join the mailing list. dsq will likely work on other platforms that Go is ported to ...
Part of the SQL Server 2022 blog series. Microsoft SQL Server 2022 introduces the newest version of PolyBase, and with it the capability to query data where it lives, virtualize data, and use REST ...
Databricks Lakehouse Platform combines cost-effective data storage with machine learning and data analytics, and it's available on AWS, Azure, and GCP. Could it be an affordable alternative for your ...
In two of my previous articles (check them out here and here), I talked about the rationale behind converting text files to compressed columnar file formats such as Parquet when dealing with big data.
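
As a rough illustration of that rationale, here is a minimal sketch in plain Python comparing a row-oriented text layout with a column-oriented one. It does not use Parquet or any Parquet library; the sample records and the helper names (`to_row_text`, `to_columns`, `sum_from_row_text`, `sum_from_columns`) are hypothetical and invented for this example only.

```python
import csv
import io

# Hypothetical sample data, invented purely for illustration.
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.5},
    {"id": 2, "city": "Lima", "amount": 7.25},
    {"id": 3, "city": "Oslo", "amount": 3.0},
]


def to_row_text(records):
    """Serialize records one row at a time, the way a CSV text file stores them."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_columns(records):
    """Pivot records into one list per column (the basic idea behind columnar layouts)."""
    return {name: [r[name] for r in records] for name in records[0]}


def sum_from_row_text(text, column):
    """Reading one column from row-oriented text still parses every field of every row."""
    return sum(float(r[column]) for r in csv.DictReader(io.StringIO(text)))


def sum_from_columns(columns, column):
    """With a columnar layout, only the requested column is touched and types are kept."""
    return sum(columns[column])


row_text = to_row_text(rows)
columns = to_columns(rows)

print(sum_from_row_text(row_text, "amount"))
print(sum_from_columns(columns, "amount"))
```

Both functions return the same total, but the columnar version reads a single list of already-typed values. Storing similar values next to each other is also what tends to make column-oriented files compress well, although this sketch does not measure that.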