
Is there a solid fork of .NET Core CLI without data collection?

Kasey Speakman

Just curious if anyone knows of one that is commonly accepted. My brief searches didn't turn up anything.

Currently we set the environment variable on all our machines to turn off dotnet core CLI data collection. (Even though this does not appear to stop it in some cases.) I have always hated data collection schemes, but given recent headlines, it seems only prudent to start taking more proactive measures.

Edit: It occurs to me that some might be unfamiliar with data collection in .NET Core CLI. So I will repost an explanation below.


Microsoft calls their data collection "telemetry", and it is in the dotnet core command-line tools. It is enabled by default, but you can opt out by setting an environment variable. It is not well-liked.
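For anyone scripting the opt-out into a dev setup, a minimal Python sketch follows. The variable name `DOTNET_CLI_TELEMETRY_OPTOUT` and the values `1` / `true` come from Microsoft's telemetry documentation; the helper names and the case-insensitive comparison are assumptions made for illustration, not part of the CLI.

```python
import os

# Opt-out variable documented for the .NET CLI.
OPTOUT_VAR = "DOTNET_CLI_TELEMETRY_OPTOUT"


def telemetry_opted_out(env=None):
    """Return True if the opt-out variable is set to 1 or true."""
    env = os.environ if env is None else env
    # Assumption: compare case-insensitively and ignore surrounding whitespace.
    return env.get(OPTOUT_VAR, "").strip().lower() in {"1", "true"}


def env_with_optout(base=None):
    """Return a copy of the environment with the opt-out variable set."""
    env = dict(os.environ if base is None else base)
    env[OPTOUT_VAR] = "1"
    return env


if __name__ == "__main__":
    print("opted out:", telemetry_opted_out())
```

The copy returned by `env_with_optout()` can be passed as the `env` argument of `subprocess.run` when a script launches `dotnet`, so the opt-out does not depend on each machine's shell profile. It only covers processes started that way, and it does nothing about the post-install behavior mentioned further down.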

.NET core should not SPY on users by default #6145

ghost posted on

@blackdwarf @piotrMSFT I am very disappointed to discover that .NET core comes with a hidden and enabled spy utility that reports on its users. (Lakshanf/issue2066/telemetry dotnet/cli#2145). Apparently, MS has learned nothing from the backlash against Windows 10 spying on users. I suspect many will not want to install .NET core for this reason, which is a shame because .NET core is otherwise cool.

It was announced as something of a footnote to a release candidate build back in 2016.

The MSDN page about it, along with opt-out instructions, can be found here, although some of the wording there is understated.

Telemetry message wording is "inaccurate" #10262

Steps to reproduce

Just type dotnet in a console. There is a notification about the use of telemetry and how to disable it with an environment variable. The message states that telemetry data is shared with the community.

Actual behavior

The wording of the message is inaccurate and misleading, as only "some" of the data "used to be" shared with the community. Looking at https://docs.microsoft.com/en-us/dotnet/core/tools/telemetry tells us that only 5 out of 13? data points "used to be" shared. Also, the data blobs are not raw data; they are already preprocessed and are missing a lot of useful information that is gathered by telemetry but not exposed to the public, such as time of invocation. What I mean by "used to be" is that the last blob currently available comes from late 2017. The latest data (the last 5 quarters right now) is not publicly available, even in this tiny aggregated form.

Suggested change

At the very least, I expect a change to the wording: get rid of the sentence about sharing data with the community. Alternatively, change the wording to reflect that only a portion of the gathered data was shared with the community.

EU GDPR Sidenote

According to this page, https://ec.europa.eu/info/law/law-topic/data-protection/reform/what-personal-data_en, the gathered data, especially MAC addresses (even hashed, obviously), counts as personal information, which should be managed scrupulously.

ML.NET telemetry (added a day later)

The mlnet CLI tool uses this messaging: "The data is anonymous and doesn't include personal information or data from your datasets." This is also inaccurate and somewhat misleading. Describing this dataset as anonymous and free of personal information is false. Also, I wasn't aware that the mlnet CLI was going to use telemetry, so I disabled it only after I had run it for the first time with the quickstart example. The basic mlnet command doesn't tell you about telemetry. That is a bit concerning.

Opinion

IMHO some variant of 99% anonymised telemetry (as in nearly impossible to cross-check or correlate) makes sense for usage metrics... The data used by dotnet and mlnet is not all that anonymous. Publicly sharing only a portion of the data leads to different conclusions about the planned usage of the data.

Note: It does not currently collect runtime data. Meaning, when you invoke a built app with dotnet MyApp.dll, it does not collect anything at this moment. It does record and report some data points as you use it for dev purposes (e.g. dotnet run) unless you opt out by setting an environment variable. It has been reported that some post-install scripts send off telemetry data even if you opted out. But afterwards, it should not.


Discussion

 

I disable telemetry from OS level, using hosts file with this tool:

github.com/Wohlstand/Destroy-Windo...

I haven't tried this, but it seems good too: github.com/10se1ucgo/DisableWinTra...

 

Do you know if this also affects dotnet core telemetry? I am unsure whether core's goes to the same locations that Windows 10 uses.

 

I really don't know for sure, but I think the telemetry stuff is on the same servers...

BTW, you can check the servers' IP addresses using tools like Fiddler (and then add them to your hosts file).
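A hosts file is keyed by hostname rather than by IP address, so what goes into it is the hostnames seen in a capture, each pointed at a non-routable address. A rough Python sketch of building those lines is below; the hostnames are hypothetical placeholders (not real telemetry endpoints) and `hosts_entries` is a made-up helper name.

```python
def hosts_entries(hostnames, sink="0.0.0.0"):
    """Build hosts-file lines that point each hostname at a sink address."""
    lines = []
    seen = set()
    for name in hostnames:
        # Hostnames are case-insensitive, so normalize before de-duplicating.
        name = name.strip().lower()
        if not name or name in seen:
            continue
        seen.add(name)
        lines.append(f"{sink} {name}")
    return lines


if __name__ == "__main__":
    # Hypothetical hostnames; replace with the ones observed in a capture.
    observed = [
        "telemetry.example.invalid",
        "Telemetry.example.invalid",
        "events.example.invalid",
    ]
    print("\n".join(hosts_entries(observed)))
```

Because the entries are hostnames, they keep working if the addresses behind them change, but not if the tool starts contacting a different hostname.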

I also found that there is an opt-out env var:

docs.microsoft.com/en-us/dotnet/co...

gist.github.com/nathanchere/b32d92...

We use that env var already. Setting it is part of our dev setup instructions. But there is no guarantee that it will continue to work in the future.

The same goes for the IP addresses if you write them down: MS changes those IPs every now and then (to keep their telemetry working).

Yes, and Windows updates seem to turn data collection back on after it has been turned off.

 

This is the first I'm hearing about data collection in .net core... Can you provide some more info about this please? Do you mean in Visual Studio?

 

They call it telemetry, and it is in the dotnet core command-line tools. It is enabled by default, but you can opt out by setting an environment variable. It is not well-liked.

.NET core should not SPY on users by default #3093

ghost commented on May 18, 2016

@blackdwarf @piotrMSFT I am very disappointed to discover that .NET core comes with a hidden and enabled spy utility that reports on its users. (Lakshanf/issue2066/telemetry #2145). Apparently, MS has learned nothing from the backlash against Windows 10 spying on users. I suspect many will not want to install .NET core for this reason, which is a shame because .NET core is otherwise cool.

It was announced as something of a footnote to a release candidate build back in 2016.

The MSDN page about it, along with opt-out instructions, can be found here, although some of the wording there is understated.

Telemetry message wording is "inaccurate" #11311

arekbal commented on May 10, 2019

Steps to reproduce

Just type dotnet in a console. There is a notification about the use of telemetry and how to disable it with an environment variable. The message states that telemetry data is shared with the community.

Actual behavior

The wording of the message is inaccurate and misleading, as only "some" of the data "used to be" shared with the community. Looking at docs.microsoft.com/en-us/dotnet/co... tells us that only 5 out of 13? data points "used to be" shared. Also, the data blobs are not raw data; they are already preprocessed and are missing a lot of useful information that is gathered by telemetry but not exposed to the public, such as time of invocation. What I mean by "used to be" is that the last blob currently available comes from late 2017. The latest data (the last 5 quarters right now) is not publicly available, even in this tiny aggregated form.

Suggested change

At the very least, I expect a change to the wording: get rid of the sentence about sharing data with the community. Alternatively, change the wording to reflect that only a portion of the gathered data was shared with the community.

EU GDPR Sidenote

According to this page, ec.europa.eu/info/law/law-topic/da..., the gathered data, especially MAC addresses (even hashed, obviously), counts as personal information, which should be managed scrupulously.

ML.NET telemetry (added a day later)

The mlnet CLI tool uses this messaging: "The data is anonymous and doesn't include personal information or data from your datasets." This is also inaccurate and somewhat misleading. Describing this dataset as anonymous and free of personal information is false. Also, I wasn't aware that the mlnet CLI was going to use telemetry, so I disabled it only after I had run it for the first time with the quickstart example. The basic mlnet command doesn't tell you about telemetry. That is a bit concerning.

Opinion

IMHO some variant of 99% anonymised telemetry (as in nearly impossible to cross-check or correlate) makes sense for usage metrics... The data used by dotnet and mlnet is not all that anonymous. Publicly sharing only a portion of the data leads to different conclusions about the planned usage of the data.

Note: It does not currently collect runtime data. Meaning, when you invoke a built app with dotnet MyApp.dll, it does not collect anything at this moment. It does record and report some data points as you use it for dev purposes (e.g. dotnet run) unless you opt out by setting an environment variable. It has been reported that some post-install scripts send off telemetry data even if you opted out. But afterwards, it should not.

 

Thank you for your very informative reply! Definitely gives food for thought... Those data points don't look too bad at least. For me, MAC address and maybe working folder could be closer to breaking anonymity than is ideal. Hopefully the intentions are truly to just make improvements to the framework. Makes sense to be looking for a fork with telemetry excluded though.
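The MAC address concern applies even when the value is hashed, because a hash is deterministic: the same machine always produces the same value, so it still works as a stable per-machine identifier. Below is a small Python illustration, assuming SHA-256 (the algorithm the telemetry page names for this data point). The normalization step and the sample addresses are made up for the example and are not taken from the CLI's actual code.

```python
import hashlib


def hashed_id(mac):
    """Return the SHA-256 hex digest of a normalized MAC address string."""
    # Assumption: strip separators and uppercase before hashing.
    normalized = mac.replace(":", "").replace("-", "").upper()
    return hashlib.sha256(normalized.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Made-up MAC addresses for illustration.
    first = hashed_id("00:11:22:33:44:55")
    again = hashed_id("00-11-22-33-44-55")
    other = hashed_id("00:11:22:33:44:56")
    print(first == again)  # same machine yields the same identifier
    print(first == other)  # a different machine yields a different one
```

Hashing hides the raw address, but records from one machine can still be linked to each other across sessions, which is why the value reads as pseudonymous rather than anonymous.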

For me it is less about the data being collected and more about the precedent of taking data from me without my explicit consent. I think the people in charge now are probably using it in good faith, but it sets a bad precedent for future executives in the same role. People come and go. There are strong business incentives to take more data over time, to make collection non-optional, etc. For some particularly frightening reading, look up eBay vs Newmark (eBay, as a shareholder of craigslist, sued craigslist), wherein the judge ruled that it is the responsibility of for-profit corporate execs to maximize profit for shareholders, regardless of social good. I wish I were joking. As such, I believe this data collection door will only be opened wider over time as long as money is left on the table.

You make some good points. I agree with you on the opt-in, this should be asked when it gets installed perhaps. I guess asking for forgiveness rather than permission gets them more data. Which, as we've established, isn't so bad now but who knows what it will become in the future. Really doesn't look great on Microsoft from a public standpoint - if you're suspicious about them because of the past, it's something that would cause you to stay so.

That being said though, I'm far more annoyed about what my phone is sending out to a cloud somewhere than what a small subset of my dev environment is.

Oh, for sure. I haven't mentioned that here because we don't issue cell phones at work. But the dotnet issue does intersect with "dev" for us.

At home, I am looking into other solutions for cell phones. There is one coming out soon that is supposed to be user-loyal (as opposed to manufacturer-loyal) and linux-based. Librem 5. I'll probably wait until it supports my particular carrier well enough (currently lacks GSM LTE band 12 which supposedly will make my signal bad).

Librem looks very cool! Hopefully it grows into something bigger and doesn't become a fringe case company that ends up going under due to lack of attention. Not even Microsoft, with their resources, could get into the smartphone market. People like their widgets and Apple and Android have a lot.

I hope so too. The company has already been selling laptops for a bit now, so it does not seem like they are in danger of failing to launch. However their products will most likely remain niche so long as we (society) accept giving away privacy in exchange for "free" apps. Who knows, they might bring something unforeseen to the market that will make people want to use them. Here's hoping. I plan to give them a try, at least to support the idea.

 

Updated title to clarify that the data collection is on the CLI tools, not built into dotnet core libs.