Osma Ahvenlampi

@[email protected]

I'm more active on @[email protected] though that might change later. This account I expect to eventually hold archives from my previous accounts on Twitter etc.

Systems, organizations, products, platforms, software, science, and a little bit of politics. Whatever you think I identify with, I probably don't.

114 Posts, 100 Following, 1 Follower

I'm not surprised that my more active account on a big server doesn't see spam, because it has a good admin. It is a bit curious that I haven't seen any here, either. I guess it's all very Mastodon-specific.

I believe I figured out how to import old posts into this account without causing it to flood the fediverse with them. They're now in my local profile and should be browsable by other servers - as well as indexed by them, if they ever see those posts, e.g. due to a boost by someone.

Sadly, as part of this process I (half-accidentally) lost all the previous followers.

It's now been two weeks since Finland closed the majority of our russia border crossing points and a week since the last of the border points closed. This situation is going to last for another week - the closure will end on the 14th, unless new decisions extend it before that.

But what's the bigger picture here? Is this a repeat of the situation experienced on the belarusian border to Lithuania, Poland and Latvia in 2021? What of the last migration influx to Finland back in 2015-2016? 1/13

@osma Test reply from another account - one which I'll try to moderate away.

In January 2022 I found the attached @jonlevy thread on the birdsite, which I translated into Finnish. This thread contains that Finnish translation, dug up from my archives.

In January 2022 I found this @jonlevy thread on the birdsite which I translated to Finnish. Here is a replica of it, since I've deleted the original along with all of my other content.
twitter.com/jonlevyBU/status/1

I see this @arstechnica article is trending. It's inaccurate in a problematic way, though.

"Port 5353/UDP" is not just some "field" - it's a whole other protocol: service discovery over Multicast DNS. In this case, your iPhone looking for AirPlay targets after making a new network connection.

The reason this matters is that Wifi privacy was presented as a way to hide your device identity from pass-by Wifi networks you didn't even connect to, and it does that. Revealing your device id after you've connected isn't great, but easy to mitigate: just don't connect your devices to random public Wifis. There are much worse problems you're exposing yourself to if you do that.

arstechnica.com/security/2023/

Why do we have solar eclipses? Because the Moon happens to be 1/400th the diameter of the Sun, while also at 1/400th the distance from us. But why such a coincidence?

If the Moon were much larger or closer, the tides would be quite disruptive. If it were much smaller or more distant, it wouldn't be a good asteroid shield. Either might have prevented our emergence.

It's a coincidence Earth has such a Moon, but it may not be such a coincidence we're here to see it.
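
The 1/400 figures can be checked with a few lines of arithmetic. This is a minimal sketch using rounded mean values for the diameters and distances (these numbers are not from the post, and the real distances vary over each orbit, which is why some eclipses are total and others annular):

```python
import math

# Rounded mean values in kilometres; actual distances vary over each orbit.
SUN_DIAMETER_KM = 1_392_700
MOON_DIAMETER_KM = 3_474.8
SUN_DISTANCE_KM = 149_600_000
MOON_DISTANCE_KM = 384_400


def angular_diameter_deg(diameter_km: float, distance_km: float) -> float:
    """Apparent diameter, in degrees, of a sphere seen from distance_km away."""
    return math.degrees(2 * math.asin((diameter_km / 2) / distance_km))


sun_deg = angular_diameter_deg(SUN_DIAMETER_KM, SUN_DISTANCE_KM)
moon_deg = angular_diameter_deg(MOON_DIAMETER_KM, MOON_DISTANCE_KM)

print(f"diameter ratio: {SUN_DIAMETER_KM / MOON_DIAMETER_KM:.0f}")
print(f"distance ratio: {SUN_DISTANCE_KM / MOON_DISTANCE_KM:.0f}")
print(f"Sun: {sun_deg:.2f} deg, Moon: {moon_deg:.2f} deg")
```

Both ratios come out near 400, so the two discs appear almost the same size in the sky, each about half a degree across.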

@YetAnotherGeekGuy sorry for the delay in replying. This server, which is a very small single-user instance, knows about 5128 other fedi domains, a small fraction of the whole. Its average fetch time for a profile update from one of those other servers is 7 seconds, p99 27 seconds. The *scatter* takes too long, never mind the gather. And for the effort it takes to maintain (local? distributed?) bloom filters, you might as well maintain the index itself and avoid needing to poll hundreds (likely thousands) of servers in a second pass. No one wants to wait half a minute to have their search complete.
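
A rough back-of-the-envelope sketch of that scatter problem. It assumes response times are independent across servers and that each response has a 1% chance of exceeding the p99 time; both are simplifying assumptions for illustration, not measurements:

```python
def prob_any_slower_than_p99(servers: int) -> float:
    """Chance that at least one of `servers` responses exceeds the p99 time.

    Assumes each response independently has a 1% chance of being that slow.
    """
    return 1.0 - 0.99 ** servers


for n in (10, 100, 1000, 5128):
    print(f"{n:>5} servers: {prob_any_slower_than_p99(n):.3f}")
```

Under those assumptions, a query that must wait for every one of a few hundred servers will almost always be held up by at least one p99-slow response, so the tail latency, not the average, sets the search time.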

Almost completely random, but I can't not share this article about a bird thought to be extinct, from a fedi server powered by software named for that bird. @[email protected]
www.theguardian.com/environmen

@YetAnotherGeekGuy This may be an unpopular opinion, but for search, I don't think a centralized index is a bad solution. When Google was *search*, it was great. It only turned to shit when they replaced search with AdWords everywhere. That said, two alternatives, in order of increasing complexity and decentralization:

- a number of search-as-service operators, to which server admins can relay messages and offload index maintenance for amortized cost
- federated partial indices, in which co-operating servers each index a part of the whole and searches are distributed

Federating everything everywhere, that is each server indexing its own content and returning results to all searches, would not work. The thundering herds and search lag would be horrible.
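
One way to picture the "federated partial indices" option is a term-partitioned index. The sketch below is only an illustration under assumptions: the server names are placeholders, the partitioning scheme (a stable hash of each term) is one possible choice and not something proposed in the post, and everything runs in one process instead of over the network.

```python
import hashlib
from collections import defaultdict

# Hypothetical co-operating servers; the names are placeholders.
SERVERS = ["index-a.example", "index-b.example", "index-c.example"]


def shard_for(term: str) -> str:
    """Pick the server responsible for a term, by a stable hash of the term."""
    digest = hashlib.sha256(term.lower().encode("utf-8")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


# One partial inverted index per server: term -> set of post URLs.
indices = {server: defaultdict(set) for server in SERVERS}


def index_post(url: str, text: str) -> None:
    """Send each distinct term of a post to the shard that owns it."""
    for term in set(text.lower().split()):
        indices[shard_for(term)][term].add(url)


def search(query: str) -> set:
    """Ask only the shards that own the query terms, then intersect the hits."""
    results = None
    for term in set(query.lower().split()):
        hits = indices[shard_for(term)].get(term, set())
        results = set(hits) if results is None else results & hits
    return results or set()
```

With this layout a query contacts at most one server per query term, instead of every server in the network, which is the difference from having each server answer all searches.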

This is not only a problem for the US presidential election. Finnish democracy will get its share of this too, as disinformation and conspiracy theories spread on social media platforms. And we have a presidential election coming up here as well!

The EU's DSA came into force last week, and one of its consequences is that those platforms have promised to make their content recommendation algorithms optional. They have not, however, promised to make them transparent; it's either/or.

www.washingtonpost.com/technol

@malte So you're seeing a meaningful difference between the view from a "tiny inactive server" vs something larger, but not necessarily something larger vs the whole fediverse. A matter of degree, then.

As it is today, we can't even say how large a section of fedi its biggest individual server, mastodon.social (1.5M users out of 10.5M, according to fedidb.org), does not see. Could be marginal (assuming most active users get federated over there), could be very large (if significant sub-clusters not connected to mastodon.social exist - and some do, because mastodon.social is blocked by some servers).

@drahardja @AdeptVeritatis Sure. There's a big difference between finding content "from the outside", as it were, and finding it "inside".

We don't have to imagine content on Twitter not found on Google - that's already in play. But even when it was found, finding content specifically on Twitter was better approached by searching on Twitter - it had metadata Google didn't index.

That's what I expect from a good fedi search tool, as well.

Firefish antennae are a good example. That, but over the entire fedi, in real time, and with more query expression power.

@drahardja @AdeptVeritatis The latter is pretty much what I believe will happen - search-as-a-service that servers can subscribe to, in order to reduce their own infrastructure demands while getting better search functionality and coverage. The only question here is the terms of the search availability - a paid service would be the most obvious fit to the current model of fedi services.

@drahardja @AdeptVeritatis If there's a viable model for one search service, there will be several trying for it, because this is not a monopoly.

Personally, I'd rather see them be someone other than Google, Meta and Microsoft.

@drahardja @AdeptVeritatis Let me try to rephrase what I understood you asked:

Let me ask other servers across the fedi to execute a search I make locally and send their results back.

That right?

VERY hard to implement without opening significant denial of service exploit risks. Heavy queries from foreign servers would bog down servers, and queries with massive results could be used to spam a server.

@kallekn @osma I haven't checked how Firefish deals with opt-in today - my expectation is that it does what the Vyr/Universeodon search fork, and Eugen's original proposed implementation do, that is federated timeline discoverability would have been enough to enable indexing. Now that Mastodon will be introducing a separate indexable opt-in metadata, I'd expect Firefish and others to switch to that model, too. So, yes, more or less the same.

I don't know if there are any plans for antennae-like search extensions on Mastodon. As you say, if you want them, Firefish exists already.

The global search I'm thinking of is another thing. That's a function of how much the index sees (point 1 above) and how far back in time the data is retained. Search on the scale Twitter used to present is not cheap to build or maintain.

@jwildeboer @glynmoody Mastodon 4.2 also, by default, will block GPTBot from indexing it. Assuming data collectors follow ethical privacy practices, the opt-in model will be there. If you assume ethics will be violated, well, you're posting stuff with public visibility on the Internet, and access to it certainly isn't (technically) prohibitively expensive.
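
As a sketch of what such a block means in practice, assuming it is expressed as a robots.txt rule (the rule text and URL below are illustrative, not copied from any server), Python's standard `urllib.robotparser` shows how a compliant crawler would read it:

```python
from urllib.robotparser import RobotFileParser

# Assumed rule for illustration; a real server's robots.txt may contain more.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Placeholder URL on a hypothetical server.
url = "https://social.example/@someone/1"
print(parser.can_fetch("GPTBot", url))
print(parser.can_fetch("SomeOtherBot", url))
```

The rule only binds crawlers that choose to consult it; nothing in it technically prevents a fetch, which is the point about ethics above.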

@jwildeboer @glynmoody I happen to think that having search services available made the Internet much more valuable than not having them, and I am also frustrated when search does not present content which I know exists, but which the index is unable to locate. So, yeah, a search service which actually finds shit is, I think, a fairly desirable feature.

If you think you can run one on social.wildeboer.net for yourself, well, good luck with that effort! I know I won't be able to do it for this server.

@kallekn @osma Three parts:

1. Both Firefish and Mastodon get their feed the same way: ActivityPub server push messages. Each server sees only that content which is federated to them, or which they explicitly fetch from the origin server (due to a reference in another toot, typically).

2. Search is implemented as an ElasticSearch (or similar) index over the public toots. Mastodon has had this index before, but did not show toots from it in your search unless they were your own, or you were mentioned, or you had engaged with the toot before. That's changing now.

3. Firefish antennae are prebuilt, recurring search subscriptions. Mastodon's closest equivalent is subscribed hashtags. Antennae are conceptually similar, but allow subscribing to a more complex query than a single hashtag.
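
The idea in point 3, a stored query that is evaluated against each post as it arrives, can be sketched as below. This is a hypothetical illustration of the concept only; the class and its fields are made up here and do not reflect how Firefish actually implements antennae.

```python
from dataclasses import dataclass, field


@dataclass
class StandingQuery:
    """Hypothetical antenna-like subscription: all of `required`, none of `excluded`."""
    required: set
    excluded: set = field(default_factory=set)
    matches: list = field(default_factory=list)

    def offer(self, post_text: str) -> bool:
        """Test one incoming post; remember it if it satisfies the query."""
        words = set(post_text.lower().split())
        hit = self.required <= words and not (self.excluded & words)
        if hit:
            self.matches.append(post_text)
        return hit


query = StandingQuery(required={"fedi", "search"}, excluded={"spam"})
for post in ("Global fedi search is coming", "fedi search spam", "just fedi"):
    query.offer(post)
print(query.matches)
```

A subscribed hashtag is the special case where `required` holds a single tag and `excluded` is empty.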

@OldGitPhil Are you the admin of mastodon.org.uk? If not, you won't be able to block it. You can choose to not search yourself and leave the indexable opt-in unchecked, which will leave your content off the index of ethical parties, but you can't prevent your toots from being indexed by less scrupulous data grabbers - which certainly already exist.

@drahardja @AdeptVeritatis This was my question. There will be a global search provider. It will be either an existing, ad-funded one which will not only index the toots, but use your search interests to build a profile, too, a new ad-funded one which will do the same, but compete against the old one(s), or one (/several) which funds their operations from some other source of revenues.

Sounds like you prefer the former.

@ianbetteridge On an individual basis, understandable, but also not viable as a service business model.

Thus that leads to one of only two possible outcomes:

- ad-funded service, with all the downsides we've grown to expect of that model
- information exclusivity to only those who subscribe to a premium service

Well, realistically, we'll see both. Most of fedi has already been indexed by any number of private data plays, while a standardized opt-in model will make it politically acceptable to open an ad-funded search engine. Perhaps Google - but I'd hope for someone else, just for competition's sake.

@ben There have been several opt-in attempts to do this as well. Those were driven off precisely because of disagreements over what constituted opt-in - many argued that the combo Eugen proposed for Mastodon 4.2 (and since changed due to feedback) would be sufficient: if a profile is discoverable (= visible to Federated timeline) and the toot is public, it could be indexed.

Crucially, third party projects could do no better, because that was the only metadata available. Mastodon 4.2 adds a new profile attribute (but sadly does not deprecate the old one - so privacy settings are increasingly oblique).
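
The two readings of "opt-in" described above can be written as two predicates. This is a minimal sketch: the dictionary keys (`discoverable`, `indexable`, `visibility`) follow the wording of the post and are assumptions about the data shape, not a statement of any server's actual code.

```python
def indexable_old(profile: dict, post: dict) -> bool:
    """Earlier reading: a discoverable profile plus a public post is enough."""
    return bool(profile.get("discoverable")) and post.get("visibility") == "public"


def indexable_new(profile: dict, post: dict) -> bool:
    """With a separate opt-in attribute: indexable profile plus public post."""
    return bool(profile.get("indexable")) and post.get("visibility") == "public"


# A profile that is discoverable but has not ticked the separate opt-in.
profile = {"discoverable": True, "indexable": False}
post = {"visibility": "public"}
print(indexable_old(profile, post), indexable_new(profile, post))
```

The same profile and post get different answers under the two rules, which is where the disagreements over what counted as opt-in came from.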

Mastodon 4.2 release will be a major milestone for all of fedi. When the biggest server flavor enables toot search and settles on a privacy declaration to make it opt-in, that settles the argument on whether global search is okay here or not: it will be.

In the past year, a lot of projects to do this exact thing have been killed and people banished for it, but that's the price for spearheading change.

However, no Mastodon server sees enough of fedi to actually provide global search - not even mastodon.social, and certainly not whichever site you are registered on. They can only index the part they see, after all.

So, the next step on this path is a global search service provider subscribing to ALL servers (modulo moderation). Running one will be a very expensive operation, though - funded either by ads, or by (server admin?) subscription.

Would you pay to have access to global fedi search?