Privacy, Protection, Obscurity

I'm working on a project with a few interesting requests from the client. Wondering if anyone has any suggestions or would like to weigh in.

I'm creating a Web App for my client to distribute internally to a handful of people in the company. They need to be able to access the Web App from anywhere in the world.

The client requested that their users NOT have to log in to use the app, but they don't want everyone in the world to be able to access it. It made me think of the Google Apps sharing feature, where you can create a link and whoever has the link can access the file. Security by obscurity, basically. So I thought I could create a URL containing 50 or 60 random alphanumeric characters, which the client can then share with their team.
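For illustration, here is a minimal sketch of generating such a link. It assumes Python's standard `secrets` module; the host `app.example.com`, the `/s/` path, and the function names are made up for the example. At 56 characters drawn from a 62-symbol alphabet the token carries roughly 330 bits of randomness, so guessing it is not the realistic risk; the link leaking is.

```python
import secrets
import string

# 62 alphanumeric symbols: a-z, A-Z, 0-9
ALPHABET = string.ascii_letters + string.digits


def make_share_token(length: int = 56) -> str:
    """Return a random alphanumeric token drawn from the OS CSPRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def token_matches(presented: str, expected: str) -> bool:
    """Compare tokens in constant time so the check leaks no timing signal."""
    return secrets.compare_digest(presented.encode(), expected.encode())


token = make_share_token()
# Hypothetical host and path, purely for illustration.
share_url = f"https://app.example.com/s/{token}"
```

The server would store `token` and call `token_matches` on the value taken from the request path before serving anything.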

I'm wondering if this is the best idea or if there is a better way to go about it.

Also, to ensure the pages won't be crawled, I was planning on using a robots.txt file that blocks everything.
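As a rough sketch of what "blocks everything" means, the snippet below holds a block-all robots.txt as a string and checks it with Python's standard `urllib.robotparser`. The URL and the helper name `is_allowed` are assumptions for the example, and the check only shows what a well-behaved crawler would conclude; it does nothing to stop a crawler that ignores the file.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from every path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler may fetch url under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL, purely for illustration.
googlebot_allowed = is_allowed(
    ROBOTS_TXT, "Googlebot", "https://app.example.com/s/abc"
)
```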

Lastly, we want to collect analytics using Google Analytics so they can get a better idea of how their staff is using the App. It's my understanding that there is no direct connection between Google Analytics and Google Search, in the sense that as long as I block the Google Search bot using robots.txt, anything captured in Google Analytics won't be automatically indexed. I want to make sure this is the case: that nothing sent to Google Analytics will automatically show up in a public forum.

The key to this whole thing is to be able to create a workflow that assures the client that, as long as the link doesn't get out into the public, their content is "protected." Or at least, as protected as unprotected content can be.

Any thoughts?

Thanks, Howie

There are 2 best solutions below

3
On

I would very carefully read both the GA ToS and Google's general ToS and policies. Perhaps their spiders will ignore it, perhaps they won't. You can bet your arse that even if they don't directly put it in a Google SERP, they will still keep a record of your URL somewhere. And who's to say their policy won't change tomorrow?

And what if someone emails the link to someone else? It will almost certainly be fair game within Gmail. Gmail's ToS most definitely say that they can scan your emails and use them for targeted advertising, which means who knows who that link will get passed to. The same goes for any other email, IM, or communication service.

Also, the bottom line is that a crawler does not have to respect robots.txt, and there are many bots out there that don't. All it takes is for one of them to find the URL and publish it somewhere (a page on some site, a publicly accessible file, etc.) that isn't covered by any robots.txt. Sooner or later it may end up in Google's (or Bing's, etc.) SERP without the search engine ever grabbing it directly from your site.

If it were me, I would absolutely, 150%, tell them that the bottom line is you can't even begin to guarantee the privacy of the app unless it is behind some kind of authentication barrier.

edit: What about making your own client, like a Java app to be installed on their phone/computer? It could still use the internet and your server-side script for the backend, but the interface would be accessed through the standalone app. That way someone would have to actually have the app to use it, and the public URL used to communicate with your server (actually, I'd use sockets) would be a lot more defensible.

3
On

Additional suggestions:

  • Magic links authorize a machine (via a cookie, or whatever you can use to hold a persistent identifier) and then become invalid. This would improve your security immensely, at the cost of each user having to jump through a one-time hoop per device (one-time if you write it correctly).
  • I do not believe Google will go out and crawl things just because you used Analytics.
  • There exist search engines that ignore robots.txt, but they probably won't try to brute-force something like that.
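
The one-time link idea from the first bullet could be sketched roughly as follows. This is only an illustration under assumptions: the class `MagicLinkStore` and its method names are hypothetical, state is held in memory rather than in a database, and the returned device value is what the app would set as a long-lived cookie.

```python
import secrets
from typing import Optional


class MagicLinkStore:
    """In-memory sketch; a real app would persist this state server-side."""

    def __init__(self) -> None:
        self._pending = set()   # link tokens that have not been redeemed yet
        self._devices = set()   # cookie values of devices already enrolled

    def issue_link_token(self) -> str:
        """Create a single-use token to embed in the link sent to one user."""
        token = secrets.token_urlsafe(32)
        self._pending.add(token)
        return token

    def redeem(self, link_token: str) -> Optional[str]:
        """Consume a link token; return a device cookie value, or None."""
        if link_token not in self._pending:
            return None  # unknown token, or the link was already used
        self._pending.discard(link_token)
        device_id = secrets.token_urlsafe(32)
        self._devices.add(device_id)
        return device_id

    def is_authorized(self, cookie_value: Optional[str]) -> bool:
        """Check the cookie a browser presents on later requests."""
        return cookie_value is not None and cookie_value in self._devices
```

With this shape, a forwarded or leaked link is useless once its intended recipient has opened it, and access can be revoked per device by removing its cookie value from the store.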

Can you really not ever have sign-ins, at any point? A significantly more secure "remember me always" sort of system, where you only need to sign in if something breaks, is much, much better than the alternative.
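
One way a "remember me always" cookie might look, as a sketch only: after a single sign-in the server hands out a long-lived value signed with HMAC, and later requests are accepted as long as the signature checks out and the cookie has not expired. The secret key, the one-year lifetime, the `user|expires|signature` layout, and the function names are all assumptions made for the example.

```python
import hashlib
import hmac
import time
from typing import Optional

# Assumption: a random secret that lives only on the server.
SECRET_KEY = b"replace-with-a-random-server-side-secret"
LIFETIME = 365 * 24 * 3600  # assumed lifetime of one year, in seconds


def _sign(payload: str) -> str:
    """Return the hex HMAC-SHA256 of payload under the server secret."""
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_remember_cookie(user_id: str, now: Optional[float] = None) -> str:
    """Build a 'user|expires|signature' cookie value after a sign-in."""
    issued = time.time() if now is None else now
    payload = f"{user_id}|{int(issued + LIFETIME)}"
    return f"{payload}|{_sign(payload)}"


def verify_remember_cookie(cookie: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user id if the cookie is authentic and unexpired, else None."""
    parts = cookie.rsplit("|", 2)
    if len(parts) != 3:
        return None
    user_id, expires, sig = parts
    expected = _sign(f"{user_id}|{expires}")
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered with, or signed under a different key
    current = time.time() if now is None else now
    if current >= int(expires):
        return None  # expired: this is when the user signs in again
    return user_id
```

Rotating `SECRET_KEY` invalidates every outstanding cookie at once, which gives the "sign in again if something breaks" behaviour without a routine login prompt.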