I have some very long documents. They share a fairly standard set of overall topics, but each document emphasises those topics differently, and within those topics each has different subtopics.
I would like to determine: 1. the importance/probability of each topic within each document (i.e. document 1 put more emphasis on topic 3 than document 2 did), and 2. the subtopics of each topic and their probabilities.
I have mostly seen BERTopic and Top2Vec used for short texts like tweets.
Would they be an appropriate strategy for very long documents, or is there a better one?
You'll have to try them (and other classic methods like LDA) with your own documents, against your own goals, to evaluate their applicability. No external authority with only a vague idea of what's available and important to your project can give an a priori assessment of what will work or be practical/optimal.
And once you've tried various techniques, observed where they work or don't, and have a clearer idea of what you hoped for but found lacking, you'll be able to ask more detailed questions that can generate better insight.
Most topic-modeling options will report a relative score for each topic, per document. So yes, you'll have a sense of which documents are relatively more associated with certain topics.
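For instance, here's a minimal sketch using gensim's LDA, where `get_document_topics` gives the full per-document topic distribution. The toy corpus and `num_topics` are placeholders to swap for your own data and tuning:

```python
from gensim.corpora import Dictionary
from gensim.models import LdaModel

# Toy tokenized corpus -- substitute your own documents
docs = [
    "the cat sat on the mat".split(),
    "stocks rose as the markets rallied".split(),
    "the dog chased the cat around the garden".split(),
]

dictionary = Dictionary(docs)
corpus = [dictionary.doc2bow(tokens) for tokens in docs]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2,
               passes=10, random_state=0)

# One (topic_id, probability) pair per topic, per document
for i, bow in enumerate(corpus):
    print(f"doc {i}:", lda.get_document_topics(bow, minimum_probability=0.0))
```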
Many methods don't create hierarchical "sub-topics" under higher-level topics by default, so if that's a requirement, it may take extra effort/steps.
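BERTopic, which you mention, is one option that does offer a hierarchy after fitting. A minimal sketch, assuming `docs` is your own corpus (its clustering step generally wants at least a few hundred texts, so short sub-documents help here too):

```python
from bertopic import BERTopic

docs = [...]  # your own (sub)document strings; BERTopic's clustering
              # generally needs at least a few hundred texts

topic_model = BERTopic()
topics, probs = topic_model.fit_transform(docs)

# Score every document against every topic (your point 1)
topic_distr, _ = topic_model.approximate_distribution(docs)

# Merge similar topics bottom-up into a hierarchy (toward your point 2)
hier_topics = topic_model.hierarchical_topics(docs)
print(topic_model.get_topic_tree(hier_topics))  # plain-text tree view
```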
If your documents are especially long, you may find it useful to split them into subdocuments, so the topic analysis is more sensitive to the full diversity of each document and can point to the specific places where topics reside. Such splits would ideally match the documents' own sections/chapters, but even a purely mechanical split may help you detect and characterize finer shifts in topic than a single whole-document analysis would reveal.
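A mechanical split can be as simple as packing paragraphs into roughly fixed-size chunks; here's a sketch (the 300-word target is an arbitrary assumption to tune against your models' input limits):

```python
def split_into_chunks(text, max_words=300):
    """Pack blank-line-separated paragraphs into chunks of ~max_words words."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        n = len(para.split())
        if current and count + n > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += n
    if current:
        chunks.append("\n\n".join(current))
    return chunks

# Each chunk then becomes its own "document" for the topic model, and a
# (doc_id, chunk_index) mapping lets you point back to where topics occur.
```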